how to filter sum types in Haskell


For example, given
data CampingStuff = Apple String Int
                  | Banana String Int
                  | Pineapple String Int
                  | Table String Int
                  | Chairs String Int
I want to have a query function
pickStuff :: [CampingStuff] -> ??? -> [CampingStuff]
For the ??? I want to pass Apple, and pickStuff should then pick out all the items built with that constructor, like
Apple "Jane" 3
Apple "Jack" 5
Apple "Lucy" 6
Something I can think of is
pickStuffs stuffs dummyStuff = filter (\x -> x == dummyStuff) stuffs
called as
pickStuffs stuffs (Apple "" 0)
with a hand-written instance
instance Eq CampingStuff where
  (Apple _ _) == (Apple _ _) = True
The drawbacks are:
passing extra parameters ("" 0) to the dummy value is inelegant and makes no sense;
every value constructor (Apple, Table, Chairs, ...) has to be handled in the Eq instance;
it is not scalable: in the future I would like to pick out all the apples from Jane,
like this (Apple "Jane" _)
Thank you for reading this. How can I filter a [CampingStuff] by data constructor, like Apple or Table?

The problem is that unsaturated constructors can't really be compared by value. The only things you can do with them are invoke them or pattern match on them. So if you want a function that tests for Apple, it'll have to be totally different from a function that tests for Banana - they can't share any code, because they have to compare against a different set of patterns.
This is all much easier if you refactor your type to remove the obvious duplication, leaving you with saturated value constructors. The derived Eq instance is all you'll need for comparing stuff types:
data StuffType = Apple | Banana | Pineapple | Table | Chairs deriving Eq
data CampingStuff = Stuff { stuffType :: StuffType
                          , owner :: String
                          , quantity :: Int
                          }
Then you can easily write a function of type CampingStuff -> Bool by composing a couple functions.
hasType :: StuffType -> CampingStuff -> Bool
hasType t s = stuffType s == t
and use that to filter a list:
pickStuff :: StuffType -> [CampingStuff] -> [CampingStuff]
pickStuff = filter . hasType
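As a minimal sketch of how the refactored type also covers the "apples from Jane" case in the question: the ownedBy helper, the inventory sample data, and the extra Show/Eq instances below are assumptions added for illustration and are not part of the answer.

```haskell
-- Refactored types from the answer; Show and Eq on CampingStuff are
-- derived here only so results can be printed and compared.
data StuffType = Apple | Banana | Pineapple | Table | Chairs
  deriving (Eq, Show)

data CampingStuff = Stuff
  { stuffType :: StuffType
  , owner     :: String
  , quantity  :: Int
  } deriving (Eq, Show)

hasType :: StuffType -> CampingStuff -> Bool
hasType t s = stuffType s == t

-- Hypothetical helper: test the owner field instead of the type.
ownedBy :: String -> CampingStuff -> Bool
ownedBy name s = owner s == name

-- Sample data, invented for illustration.
inventory :: [CampingStuff]
inventory =
  [ Stuff Apple "Jane" 3
  , Stuff Table "Jack" 1
  , Stuff Apple "Jack" 5
  ]

-- Predicates are ordinary functions, so they combine with (&&).
janesApples :: [CampingStuff]
janesApples = filter (\s -> hasType Apple s && ownedBy "Jane" s) inventory
```

Because each predicate has type CampingStuff -> Bool, new criteria can be added without touching the data type or writing an Eq instance by hand.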
In the comments, you ask: What if my constructors weren't all uniform, so I couldn't extract everything out to a product type with an enum in it?
I argue that, in such a case, you won't be happy with the result of a pickStuff function no matter how it's implemented. Let's imagine a simpler type:
data Color = Red | Green deriving (Eq, Ord)
data Light = Off | On Color
Now, you might wish to filter a [Light] such that it includes only lights that are On, regardless of their color. Fine, we can implement that. We won't even worry about generalizing, because the type is so small:
ons :: [Light] -> [Light]
ons = filter on
  where on Off = False
        on (On _) = True
Now you have lights :: [Light], and you can get onLights = ons lights :: [Light]. Amazing. What will you do with onLights next? Perhaps you want to count how many of each color there are:
import qualified Data.Map as M
colorCounts :: [Light] -> M.Map Color Int
colorCounts = M.fromListWith (+) . map getColor
  where getColor (On c) = (c, 1)
colorCounts has a problem: it assumes all the lights are On, but there's no guarantee of that in the type system. So you can accidentally call colorCounts ls instead of colorCounts (ons ls), and it will compile, but give you an error at runtime.
Better would be to just do your pattern matching at the point when you'll know what to do with the results. Here, that's inside of colorCounts: just add a case for Off, and use mapMaybe (from Data.Maybe) instead of map so you have a chance to throw away values you don't like:
colorCounts' :: [Light] -> M.Map Color Int
colorCounts' = M.fromListWith (+) . mapMaybe getColor
  where getColor (On c) = Just (c, 1)
        getColor Off = Nothing
The same arguments all hold for more complicated types: don't pattern match on a value until you're ready to handle all the information you might find.
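For reference, here is a self-contained sketch of the mapMaybe version that should compile as a standalone file. The sampleLights list and the Show instances are assumptions added for illustration; Ord on Color is needed because Color is used as a Map key.

```haskell
import qualified Data.Map as M
import Data.Maybe (mapMaybe)

-- Ord is required for Map keys; Show is only for printing.
data Color = Red | Green deriving (Eq, Ord, Show)
data Light = Off | On Color deriving (Eq, Show)

-- Total version: Off lights are dropped instead of crashing.
colorCounts' :: [Light] -> M.Map Color Int
colorCounts' = M.fromListWith (+) . mapMaybe getColor
  where
    getColor (On c) = Just (c, 1)
    getColor Off    = Nothing

-- Sample data, invented for illustration.
sampleLights :: [Light]
sampleLights = [On Red, Off, On Green, On Red]

-- M.toList (colorCounts' sampleLights) == [(Red,2),(Green,1)]
```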
Of course, one way to handle such information is to put it into a new type that contains only the information you want. So you could very well write a function
colorsOfOnLights :: [Light] -> [Color]
colorsOfOnLights = mapMaybe go
  where go Off = Nothing
        go (On c) = Just c
This way, you can't possibly mix up the input of the "filter" function with the output: the output is clearly divorced from the original Light type, and its values can only have come from the On lights. You can do the same thing for your CampingStuff type by extracting a new product type for each of the constructors:
data CampingStuff = Apple AppleData
                  | Banana BananaData
                  -- ...
data AppleData = AppleData String Int Bool
data BananaData = BananaData String Int
-- ...
asApple :: CampingStuff -> Maybe AppleData
asApple (Apple ad) = Just ad
asApple _ = Nothing
apples :: [CampingStuff] -> [AppleData]
apples = mapMaybe asApple
You'll need separate functions for asApple and for asBanana and so on. This seems cumbersome, and I don't exactly disagree, but in practice people don't really need large numbers of functions like this. It's usually better to do as I described before: delay the pattern match until you know what to do with the results.

For the function you want to have, you could create functions such as
isApple :: CampingStuff -> Bool
isApple Apple{} = True
isApple _ = False
and then use filter isApple. When you want to filter by Jane, you add another five functions, one per constructor, like isAppleFrom :: String -> CampingStuff -> Bool and do filter (isAppleFrom "Jane").
Another approach is the following:
data StuffType = AppleT | BananaT | PineappleT | TableT | ChairsT deriving Eq
data Query = ByStuff StuffType | ByName String deriving Eq
pickStuff :: [CampingStuff] -> [Query] -> [CampingStuff]
pickStuff xs qs = filter cond xs
  where
    cond :: CampingStuff -> Bool
    cond x = all (\q -> case (q, x) of
                     (ByStuff AppleT, Apple{}) -> True
                     ...other pairs...
                     (ByName name1, Apple name2 _) -> name1 == name2
                     ...
                     _ -> False) qs
That is, separate querying from the data types. The above is an example and may be written better.
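One possible way to write this more compactly, as a sketch only: instead of enumerating every (query, value) pair, project each value onto the parts a query can inspect. The helpers stuffTypeOf, nameOf, and matches are hypothetical names introduced here, not part of the answer above.

```haskell
data CampingStuff
  = Apple String Int
  | Banana String Int
  | Pineapple String Int
  | Table String Int
  | Chairs String Int
  deriving (Eq, Show)

data StuffType = AppleT | BananaT | PineappleT | TableT | ChairsT
  deriving (Eq, Show)

data Query = ByStuff StuffType | ByName String
  deriving (Eq, Show)

-- Hypothetical helper: which constructor built this value?
stuffTypeOf :: CampingStuff -> StuffType
stuffTypeOf Apple{}     = AppleT
stuffTypeOf Banana{}    = BananaT
stuffTypeOf Pineapple{} = PineappleT
stuffTypeOf Table{}     = TableT
stuffTypeOf Chairs{}    = ChairsT

-- Hypothetical helper: the name stored in any constructor.
nameOf :: CampingStuff -> String
nameOf (Apple n _)     = n
nameOf (Banana n _)    = n
nameOf (Pineapple n _) = n
nameOf (Table n _)     = n
nameOf (Chairs n _)    = n

-- A value matches a single query if the relevant projection agrees.
matches :: CampingStuff -> Query -> Bool
matches x (ByStuff t) = stuffTypeOf x == t
matches x (ByName n)  = nameOf x == n

-- Keep the values that satisfy every query in the list.
pickStuff :: [CampingStuff] -> [Query] -> [CampingStuff]
pickStuff xs qs = filter (\x -> all (matches x) qs) xs
```

With this shape, adding a new kind of query means adding one constructor to Query and one equation to matches, rather than one case per constructor pair.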

Related

Haskell Type Destructuring in Guards

I'm playing around with a toy project in Haskell. I'm implementing some data structures I've built in other languages before to familiarize myself with how they'd be built in Haskell. This isn't my first functional language, I've built a couple of projects like a Scheme interpreter in OCaml but I think my OCaml experience is coloring how I'm approaching this problem. It's not terribly important, but may be useful for context, to know that the data structure I'm implementing is a PR-Quadtree.
What I want to do is match and destructure a type inside a guard, a la OCaml's match statement.
data Waypoint = WayPoint {
    lat :: Float,
    lon :: Float,
    radius :: Float,
    speed :: Float,
    accel :: Float
  } deriving (Show)
data Region = Region {
    x :: Float,
    y :: Float,
    width :: Float
  } deriving (Show)
data PRQuadtree = WhiteNode Region
                | BlackNode Region Waypoint
                | GreyNode {
                    topLeft :: PRQuadtree,
                    topRight :: PRQuadtree,
                    botLeft :: PRQuadtree,
                    botRight :: PRQuadtree,
                    region :: Region
                  } deriving (Show)
getRegion node
  | BlackNode(r, _) = r
  | WhiteNode(r) = r
  | GreyNode = region node
The getRegion function is the one I am having problems with in particular. In case what I'm trying to do is unclear: I'd like to simply extract one element of the argument, but which element depends on which constructor of the algebraic data type the argument was built with. In OCaml I could do:
let getRegion node = match node with
  | BlackNode(r, _) -> r
  | WhiteNode(r) -> r
  | GreyNode -> region(node)
(or something very similar, my OCaml is a bit rusty now).
In Haskell however, this doesn't appear to bind r in scope of the RHS of the guard branch. I've tried to look up Pattern Guards, as they sound similar to what I might want to do, but I can't really grok what's going on here. Really I just want to get a binding from the LHS of the = to the RHS of the equals (depending on which branch of the guard we've gone down).
What's the idiomatic Haskell way to do what I'm trying to do here?

It can be achieved as follows:
getRegion :: PRQuadtree -> Region
getRegion (BlackNode r _) = r
getRegion (WhiteNode r) = r
getRegion GreyNode{region=r} = r
or even as
getRegion :: PRQuadtree -> Region
getRegion x = case x of
  BlackNode r _ -> r
  WhiteNode r -> r
  GreyNode{} -> region x
In Haskell, prepending a type signature is very idiomatic.
Another option is extending the region field to the other cases as well:
data PRQuadtree = WhiteNode { region :: Region }
                | BlackNode { region :: Region , waypoint :: Waypoint }
                | GreyNode {
                    topLeft :: PRQuadtree,
                    topRight :: PRQuadtree,
                    botLeft :: PRQuadtree,
                    botRight :: PRQuadtree,
                    region :: Region
                  } deriving (Show)
Now, region will work on all PRQuadtree values.
Haskell uses | as ML does when defining algebraic datatypes, to separate different constructors, but does not use it to separate case branches, which instead follow the syntax
case .. of { pat1 -> e1 ; pat2 -> e2 ; ... }
which can be replaced by indentation
case .. of
  pat1 -> e1
  pat2 -> e2
  ...
Also, note that partial field selectors are discouraged:
data A = A1 { foo :: Int } | A2
Above, foo A2 type checks but crashes. On the other hand, when a field is present in all the constructors, we do not face such risk.
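When a field genuinely exists in only some constructors, one common alternative to a partial selector is a total accessor that returns Maybe. The sketch below assumes a positional version of the A type; fooMaybe is a hypothetical name introduced for illustration.

```haskell
-- Same shape as the A type above, but without a record selector,
-- so there is no partial `foo` function to misuse.
data A = A1 Int | A2 deriving (Eq, Show)

-- Total accessor: the missing field is reported as Nothing
-- instead of a runtime crash.
fooMaybe :: A -> Maybe Int
fooMaybe (A1 n) = Just n
fooMaybe A2     = Nothing
```

Callers are then forced by the type to decide what to do in the A2 case.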

You can also write:
getRegion x
  | BlackNode r _ <- x = ....
  | GreyNode{} <- x = ....
but it is quite unidiomatic in this simple case.
However, in more complex programs, this pattern matching in guards can be very useful. You use multiple equations or case to distinguish the general cases, as shown by @chi. But then, you can detect special cases, like in the following made up example:
-- BlackNode{region} needs the NamedFieldPuns extension and the record version of BlackNode
getRegion x = case x of
    BlackNode{region}
      | [(0,_)] <- filter (inRegion region) interestingPoints
        -> -- region encloses exactly 1 interesting point on x axis
           ....
      | otherwise -> ....
    GreyNode{} -> ....
  where
    interestingPoints = .....
    inRegion :: Region -> Point -> Bool
    inRegion = .....
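To make the pattern-guard syntax concrete, here is a small runnable sketch. The Node type, its field names, and the describe function are simplified stand-ins invented for illustration, not the quadtree from the question. Pattern guards are part of Haskell 2010, so no extension is needed.

```haskell
-- Simplified stand-ins for the types in the question.
data Region = Region
  { regionX     :: Float
  , regionY     :: Float
  , regionWidth :: Float
  } deriving (Show)

data Node
  = WhiteNode Region
  | BlackNode Region (Float, Float)  -- region plus one point
  deriving (Show)

-- Each guard both tests a pattern and binds its variables for use
-- in later conditions of the same guard and in the right-hand side.
describe :: Node -> String
describe node
  | WhiteNode _ <- node = "empty"
  | BlackNode r (px, _) <- node
  , px < regionX r = "point left of region origin"
  | otherwise = "occupied"
```

Note that in a function definition each guard ends with =, whereas inside a case expression it ends with ->.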

data type with a default field and that needs a function that works with it

Say, I have a data type
data FooBar a = Foo String Char [a]
| Bar String Int [a]
I need to create values of this type and give an empty list as the last field:
Foo "hello" 'a' []
or
Bar "world" 1 []
1) I do this everywhere in my code and I think it would be nice if I could omit the empty list part somehow and have the empty list assigned implicitly. Is this possible? Something similar to default function arguments in other languages.
2) Because of this [] "default" value, I often need to have a partial constructor application that results in a function that takes the first two values:
mkFoo x y = Foo x y []
mkBar x y = Bar x y []
Is there a "better" (more idiomatic, etc) way to do it? to avoid defining new functions?
3) I need a way to add things to the list:
add (Foo u v xs) x = Foo u v (x:xs)
add (Bar u v xs) x = Bar u v (x:xs)
Is this how it is done idiomatically? Just a general purpose function?
As you see I am a beginner, so maybe these questions make little sense. Hope not.

I'll address your questions one by one.
Default arguments do not exist in Haskell. They are simply not worth the added complexity and loss of compositionality. Being a functional language, you do a lot more function manipulation in Haskell, so funkiness like default arguments would be tough to handle.
One thing I didn't realize when I started Haskell is that data constructors are functions just like everything else. In your example,
Foo :: String -> Char -> [a] -> FooBar a
Thus you can write functions for filling in various arguments of other functions, and then those functions will work with Foo or Bar or whatever.
fill1 :: a -> (a -> b) -> b
fill1 a f = f a
--Note that fill1 = flip ($)
fill2 :: b -> (a -> b -> c) -> (a -> c)
--Equivalently, fill2 :: b -> (a -> b -> c) -> a -> c
fill2 b f = \a -> f a b
fill3 :: c -> (a -> b -> c -> d) -> (a -> b -> d)
fill3 c f = \a b -> f a b c
fill3Empty :: (a -> b -> [c] -> d) -> (a -> b -> d)
fill3Empty f = fill3 [] f
--Now, we can write
> fill3Empty Foo x y
Foo x y []
The lens package provides elegant solutions to questions like this. However, you can tell at a glance that this package is enormously complicated. Here is how you would define a lens for the list field using the lens package:
_list :: Lens (FooBar a) (FooBar b) [a] [b]
_list = lens getter setter
  where getter (Foo _ _ as) = as
        getter (Bar _ _ as) = as
        setter (Foo s c _) bs = Foo s c bs
        setter (Bar s i _) bs = Bar s i bs
Now we can do
> over _list (3:) (Foo "ab" 'c' [2,1])
Foo "ab" 'c' [3,2,1]
Some explanation: the lens function produces a Lens type when given a getter and a setter for some type. Lens s t a b is a type that says "s holds an a and t holds a b. Thus, if you give me a function a -> b, I can give you a function s -> t". That is exactly what over does: you provide it a lens and a function (in our case, (3:) was a function that adds 3 to the front of a List) and it applies the function "where the lens indicates". This is very similar to a functor, however, we have significantly more freedom (in this example, the functor instance would be obligated to change every element of the lists, not operate on the lists themselves).
Note that our new _list lens is very generic: it works equally well over Foo and Bar and the lens package provides many functions other than over for doing magical things.
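To see what "applies the function where the lens indicates" means without installing anything, here is a sketch of the same idea using only base. The names Lens, listL, overL, and viewL are defined locally for illustration; the type synonym has the same shape as the one in the lens package, but this is a simplified stand-in, not that package's implementation.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))

data FooBar a = Foo String Char [a] | Bar String Int [a]
  deriving (Eq, Show)

-- Locally defined lens type (van Laarhoven style).
type Lens s t a b = forall f. Functor f => (a -> f b) -> s -> f t

-- Focus on the list field of either constructor.
listL :: Lens (FooBar a) (FooBar b) [a] [b]
listL f (Foo s c as) = Foo s c <$> f as
listL f (Bar s i as) = Bar s i <$> f as

-- Apply a function at the focus (local analogue of `over`).
overL :: Lens s t a b -> (a -> b) -> s -> t
overL l g = runIdentity . l (Identity . g)

-- Read the focus (local analogue of `view`).
viewL :: Lens s s a a -> s -> a
viewL l = getConst . l Const

-- overL listL (3:) (Foo "ab" 'c' [2,1]) == Foo "ab" 'c' [3,2,1]
```

A single definition, listL, serves for both reading and updating, which is the main convenience lenses offer over separate getter and setter functions.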

The idiomatic thing is to take those parameters of a function or constructor that you commonly want to partially apply, and move them toward the beginning:
data FooBar a = Foo [a] String Char
              | Bar [a] String Int
foo :: String -> Char -> FooBar a
foo = Foo []
bar :: String -> Int -> FooBar a
bar = Bar []
Similarly, reordering the parameters to add lets you partially apply add to get functions of type FooBar a -> FooBar a, which can be easily composed:
add :: a -> FooBar a -> FooBar a
add x (Foo xs u v) = Foo (x:xs) u v
add x (Bar xs u v) = Bar (x:xs) u v
add123 :: FooBar Int -> FooBar Int
add123 = add 1 . add 2 . add 3
add123 (bar "bar" 42) == Bar [1, 2, 3] "bar" 42

(2) and (3) are perfectly normal and idiomatic ways of doing such things. About (2) in particular, one expression you will occasionally hear is "smart constructor". That just means a function like your mkFoo/mkBar that produces a FooBar a (or a Maybe (FooBar a) etc.) with some extra logic to ensure only reasonable values can be constructed.
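As a sketch of what the "extra logic" in a smart constructor might look like: the rule that a Bar count must be non-negative is a made-up invariant for illustration, not something stated in the question.

```haskell
data FooBar a = Foo String Char [a] | Bar String Int [a]
  deriving (Eq, Show)

-- Plain convenience constructor: always succeeds, list starts empty.
mkFoo :: String -> Char -> FooBar a
mkFoo name c = Foo name c []

-- Smart constructor with validation. The non-negativity rule is a
-- hypothetical invariant chosen only to show the Maybe-returning shape.
mkBar :: String -> Int -> Maybe (FooBar a)
mkBar name n
  | n >= 0    = Just (Bar name n [])
  | otherwise = Nothing
```

In a real module you would typically export mkFoo and mkBar but not the Foo and Bar constructors, so that all values pass through the checks.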
Here are some additional tricks that might (or might not!) make sense, depending on what you are trying to do with FooBar.
If you use Foo values and Bar values in similar ways most of the time (i.e. the difference between having the Char field and the Int one is a minor detail), it makes sense to factor out the similarities and use a single constructor:
data FooBar a = FooBar String FooBarTag [a]
data FooBarTag = Foo Char | Bar Int
Beyond avoiding case analysis when you don't care about the FooBarTag, that allows you to safely use record syntax (records and types with multiple constructors do not mix well).
data FooBar a = FooBar
  { fooBarName :: String
  , fooBarTag :: FooBarTag
  , fooBarList :: [a]
  }
Records allow you to use the fields without having to pattern match the whole thing.
If there are sensible defaults for all fields in a FooBar, you can go one step beyond mkFoo-like constructors and define a default value.
defaultFooBar :: FooBar a
defaultFooBar = FooBar
  { fooBarName = ""
  , fooBarTag = Bar 0
  , fooBarList = []
  }
You don't need records to use a default, but they allow overriding default fields conveniently.
myFooBar = defaultFooBar
  { fooBarTag = Foo 'x'
  }
If you ever get tired of typing long names for the defaults over and over, consider the data-default package:
instance Default (FooBar a) where
  def = defaultFooBar
myFooBar = def { fooBarTag = Foo 'x' }
Do note that a significant number of people do not like the Default class, and not without reason. Still, for types which are very specific to your application (e.g. configuration settings) Default is perfectly fine IMO.
Finally, updating record fields can be messy. If you end up annoyed by that, you will find lens very useful. Note that it is a big library, and it might be a little overwhelming to a beginner, so take a deep breath beforehand. Here is a small sample:
{-# LANGUAGE TemplateHaskell #-} -- At the top of the file. Needed for makeLenses.
import Control.Lens
-- Note the underscores.
-- If you are going to use lenses, it is sensible not to export the field names.
data FooBar a = FooBar
  { _fooBarName :: String
  , _fooBarTag :: FooBarTag
  , _fooBarList :: [a]
  }
makeLenses ''FooBar -- Defines lenses for the fields automatically.
defaultFooBar :: FooBar a
defaultFooBar = FooBar
  { _fooBarName = ""
  , _fooBarTag = Bar 0
  , _fooBarList = []
  }
-- Using a lens (fooBarTag) to set a field without record syntax.
-- Note the lack of underscores in the name of the lens.
myFooBar = set fooBarTag (Foo 'x') defaultFooBar
-- Using a lens to access a field.
myTag = view fooBarTag myFooBar -- Results in Foo 'x'
-- Using a lens (fooBarList) to modify a field.
add :: a -> FooBar a -> FooBar a
add x fb = over fooBarList (x :) fb
-- set, view and over have operator equivalents: (.~), (^.) and (%~) respectively.
-- Note that (^.) is flipped with respect to view.
Here is a gentle introduction to lens which focuses on aspects I have not demonstrated here, especially how nicely lenses can be composed.

Haskell type of

How can I find the type of a value in Haskell?
I want something like this:
data Vegetable = Und Under
               | Abv Above
is_vegetable :: a -> Bool
is_vegetable a = if (a is of type Vegetable) then True else False
Update:
I want a datastructure to model the above tree.
I would also like to have some functions (is_drink, is_vegetable,is_wine,is_above) so that I can apply some filters on a list.

You don't. You rely on the type system to ensure that the value is a Vegetable --- if the value is not a Vegetable, your program won't compile, much less run.
is_vegetable :: Vegetable -> Bool
is_vegetable _ = True -- so there is not much point to this function
Edit, upon seeing your comment:
data Foodstuff = Vegetable Vegetable
               | Drink Drink
is_vegetable :: Foodstuff -> Bool
is_vegetable (Vegetable _) = True
is_vegetable _ = False
But this is still probably not what you want. Instead you probably want something like
case myFood of
  Vegetable vegetable -> -- something involving `vegetable`
  Drink drink -> -- something involving `drink`
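Since the question mentions applying filters to a list, here is a sketch of how the Foodstuff wrapper supports that. The constructors of Vegetable and Drink (Carrot, Potato, Water, Wine) are placeholders invented for illustration, as are the vegetables and isDrink functions.

```haskell
-- Placeholder constructors; the question does not list the real ones.
data Vegetable = Carrot | Potato deriving (Eq, Show)
data Drink = Water | Wine deriving (Eq, Show)

data Foodstuff = Vegetable Vegetable | Drink Drink
  deriving (Eq, Show)

-- A list comprehension with a pattern on the left of <- silently
-- skips elements that do not match, so this keeps only vegetables
-- and unwraps them at the same time.
vegetables :: [Foodstuff] -> [Vegetable]
vegetables foods = [v | Vegetable v <- foods]

-- A Bool predicate, if a plain filter is really what is wanted.
isDrink :: Foodstuff -> Bool
isDrink (Drink _) = True
isDrink _         = False
```

The result type [Vegetable] records that the filtering has happened, which a [Foodstuff] result from filter would not.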

You cannot do this in Haskell. All function arguments have concrete types (like Int and String) or they are type variables (like the a in your example). Type variables can be restricted to belong to a certain type class.
When you use an unrestricted type variable, then you cannot do anything interesting with the values of that type. By restricting the type variable to a type class, you get more power: if you have Num a, then you know that a is a numeric type and so you can add, multiply, etc.
From your comment, it sounds like you need a (bigger) data type to hold the different types of elements in your tree. The Either a b type may come in handy here. It is either Left a or Right b and so you can have a function like
is_vegetable :: Either Vegetable Drink -> Bool
is_vegetable (Left _) = True
is_vegetable (Right _) = False
Your tree nodes will then be Either Vegetable Drink elements.

Tip for reading function signatures in Haskell:
f :: a -> Bool
This means f takes one argument which could be anything, and f does not have any information about the type. So it is impossible for f to know if the argument is a Vegetable. There are only three possible definitions for f (two more for strict / non-strict variants, which I'm omitting for clarity):
-- version 1
f _ = True
-- version 2
f _ = False
-- version 3
f _ = undefined
You see f is a very boring function because it is not allowed to know anything about its parameter. You could do something like this:
import Data.Typeable (Typeable, cast)
isVegetable :: Typeable a => a -> Bool
isVegetable x = case cast x :: Maybe Vegetable of
  Just _ -> True
  Nothing -> False
You would need to create an instance of Typeable for Vegetable,
data Vegetable = ... deriving Typeable
The signature f :: Typeable a => a -> Bool means that f has one parameter, and it does not know anything about that parameter except that the parameter has a type that is known at runtime.
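A self-contained sketch of the Typeable approach follows. The Vegetable constructors are placeholders invented for illustration. It assumes GHC 7.10 or later, where Typeable instances are provided automatically and no deriving clause is needed for them.

```haskell
import Data.Typeable (Typeable, cast)

-- Placeholder constructors; the question does not list the real ones.
data Vegetable = Carrot | Potato deriving (Show)

-- cast succeeds only when the runtime type of x is exactly Vegetable.
isVegetable :: Typeable a => a -> Bool
isVegetable x = case (cast x :: Maybe Vegetable) of
  Just _  -> True
  Nothing -> False

-- isVegetable Carrot   == True
-- isVegetable "carrot" == False
```

This works, but the other answers' advice still applies: a sum type usually expresses the same intent with compile-time checking instead of a runtime test.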

Sort by constructor ignoring (part of) value

Suppose I have
data Foo = A String Int | B Int
I want to take an xs :: [Foo] and sort it such that all the As are at the beginning, sorted by their strings, but with the ints in the order they appeared in the list, and then have all the Bs at the end, in the same order they appeared.
In particular, I want to create a new list containing the first A of each string and the first B.
I did this by defining a function taking Foos to (Int, String)s and using sortBy and groupBy.
Is there a cleaner way to do this? Preferably one that generalizes to at least 10 constructors.
Typeable, maybe? Something else that's nicer?
EDIT: This is used for processing a list of Foos that is used elsewhere. There is already an Ord instance which is the normal ordering.

You can use
sortBy (comparing foo)
where foo is a function that extracts the interesting parts into something comparable (e.g. Ints).
In the example, since you want the As sorted by their Strings, a mapping to Int with the desired properties would be too complicated, so we use a compound target type.
foo (A s _) = (0,s)
foo (B _) = (1,"")
would be a possible helper. This is more or less equivalent to Tikhon Jelvis' suggestion, but it leaves space for the natural Ord instance.
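Put together as a runnable sketch: the helper is renamed key here, and sortFoos and firsts are hypothetical names for the two steps described in the question. Using nubBy to keep the first element of each group is one possible choice, swapped in for the groupBy the asker mentions.

```haskell
import Data.Function (on)
import Data.List (nubBy, sortBy)
import Data.Ord (comparing)

data Foo = A String Int | B Int deriving (Eq, Show)

-- Sort key: constructor rank first, then the string for As.
key :: Foo -> (Int, String)
key (A s _) = (0, s)
key (B _)   = (1, "")

-- sortBy is stable, so elements with equal keys keep their
-- original relative order.
sortFoos :: [Foo] -> [Foo]
sortFoos = sortBy (comparing key)

-- Keep only the first element for each key.
firsts :: [Foo] -> [Foo]
firsts = nubBy ((==) `on` key) . sortFoos

-- firsts [B 1, A "z" 1, A "a" 2, B 0, A "a" 1]
--   == [A "a" 2, A "z" 1, B 1]
```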

To make it easier to build comparison function for ADTs with large number of constructors, you can map values to their constructor index with SYB:
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Generics
data Foo = A String Int | B Int deriving (Show, Eq, Typeable, Data)
cIndex :: Data a => a -> Int
cIndex = constrIndex . toConstr
Example:
*Main Data.Generics> cIndex $ A "foo" 42
1
*Main Data.Generics> cIndex $ B 0
2
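As a sketch of how the constructor index can drive a sort: this version imports from Data.Data in base rather than Data.Generics from syb, which is an assumption made so the example has no extra dependency. byConstructor is a hypothetical name.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, constrIndex, toConstr)
import Data.List (sortOn)

data Foo = A String Int | B Int deriving (Show, Eq, Data)

-- 1-based position of the value's constructor in the declaration.
cIndex :: Data a => a -> Int
cIndex = constrIndex . toConstr

-- Group values by constructor; sortOn is stable, so the original
-- order is preserved within each constructor.
byConstructor :: [Foo] -> [Foo]
byConstructor = sortOn cIndex

-- byConstructor [B 1, A "z" 0, B 0, A "a" 1]
--   == [A "z" 0, A "a" 1, B 1, B 0]
```

This scales to any number of constructors without writing one equation per constructor, at the cost of a Data instance.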

Edit: After re-reading your question, I think the best option is to make Foo an instance of Ord. I do not think there is any way to do this automatically that will act the way you want (just using deriving will create different behavior).
Once Foo is an instance of Ord, you can just use sort from Data.List.
In your exact example, you can do something like this:
data Foo = A String Int | B Int deriving (Eq)
instance Ord Foo where
  (A _ _) <= (B _) = True
  (A s _) <= (A s' _) = s <= s'
  (B _) <= (B _) = True
  (B _) <= (A _ _) = False
When something is an instance of Ord, it means the data type has some ordering. Once we know how to order something, we can use a bunch of existing functions (like sort) on it and it will behave how you want. Anything in Ord has to be part of Eq, which is what the deriving (Eq) bit does automatically.
You can also derive Ord. However, the behavior will not be exactly what you want--it will order by all of the fields if it has to (e.g. it will put As with the same string in order by their integers).
Further edit: I was thinking about it some more and realized my solution is probably semantically wrong.
An Ord instance is a statement about your whole data type. For example, I'm saying that Bs are always equal with each other when the derived Eq instance says otherwise.
If the data you're representing always behaves like this (that is, Bs are all equal and As with the same string are all equal) then an Ord instance makes sense. Otherwise, you should not actually do this.
However, you can do something almost exactly like this: write your own special compare function (Foo -> Foo -> Ordering) that encapsulates exactly what you want to do then use sortBy. This properly codifies that your particular sorting is special rather than the natural ordering of the data type.
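A minimal sketch of such a special-purpose comparison, assuming the ordering described in the question; compareFoo and sortFoos are hypothetical names. Foo keeps its derived Eq and gets no Ord instance at all.

```haskell
import Data.List (sortBy)

data Foo = A String Int | B Int deriving (Eq, Show)

-- Special-purpose ordering: As before Bs, As ordered by string only,
-- all Bs considered equal for sorting purposes.
compareFoo :: Foo -> Foo -> Ordering
compareFoo (A s _) (A s' _) = compare s s'
compareFoo (A _ _) (B _)    = LT
compareFoo (B _)   (A _ _)  = GT
compareFoo (B _)   (B _)    = EQ

-- sortBy is stable, so ties keep their original relative order.
sortFoos :: [Foo] -> [Foo]
sortFoos = sortBy compareFoo
```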

You could use some Template Haskell to fill in the missing transitive cases. mkTransitiveLt creates the transitive closure of the given cases (if you order them least to greatest). This gives you a working less-than, which can be turned into a function that returns an Ordering.
{-# LANGUAGE TemplateHaskell #-}
import MkTransitiveLt
import Data.List (sortBy)
data Foo = A String Int | B Int | C | D | E deriving(Show)
cmp a b = $(mkTransitiveLt [|
  case (a, b) of
    (A _ _, B _) -> True
    (B _, C) -> True
    (C, D) -> True
    (D, E) -> True
    (A s _, A s' _) -> s < s'
    otherwise -> False|])
lt2Ord f a b =
  case (f a b, f b a) of
    (True, _) -> LT
    (_, True) -> GT
    otherwise -> EQ
main = print $ sortBy (lt2Ord cmp) [A "Z" 1, A "A" 1, B 1, A "A" 0, C]
Generates:
[A "A" 1,A "A" 0,A "Z" 1,B 1,C]
mkTransitiveLt must be defined in a separate module:
module MkTransitiveLt (mkTransitiveLt)
where
import Language.Haskell.TH
mkTransitiveLt :: ExpQ -> ExpQ
mkTransitiveLt eq = do
  CaseE e ms <- eq
  return . CaseE e . reverse . foldl go [] $ ms
  where
    go ms m@(Match (TupP [a, b]) body decls) = (m:ms) ++
      [Match (TupP [x, b]) body decls | Match (TupP [x, y]) _ _ <- ms, y == a]
    go ms m = m:ms

Haskell data structures oddity

I've been attempting to write a small file to try out a bag-like data structure. So far my code is as follows:
data Fruit = Apple | Banana | Pear deriving (Eq, Show)
data Bag a = EmptyBag | Contents [(a, Integer)]
emptyBag :: Bag a
emptyBag = EmptyBag
unwrap :: [a] -> a
unwrap [x] = x
isObject theObject (obj, inte) = theObject == obj
count :: Bag a -> a -> Integer
count (Contents [xs]) theObject = snd (unwrap (filter (isObject theObject) [xs]))
count EmptyBag _ = 0
But when I try and run it I get the error
Could not deduce (Eq a) from the context ()
arising from a use of 'isObject' at ....
Whereas when I take the count function out and call
snd(unwrap(filter (isObject Banana) [(Apple,1),(Banana,2)]))
it happily returns 2.
Any clues on why this is, or advice on writing this kind of data structure would be much appreciated.

(==) can only be used in a context that includes Eq, but when you declared count you didn't include that context. If I'm reading correctly, that would be
count :: Eq a => Bag a -> a -> Integer
If you declare count without including the type, you can ask ghci for the inferred type; or just ask for the inferred type of snd (unwrap (filter (isObject Banana) [(Apple,1),(Banana,2)]))
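With the Eq constraint in place, a sketch of a working count might look like the following. It deliberately swaps the question's unwrap/filter combination for lookup from the Prelude, so that a missing item yields 0 instead of a pattern-match failure, and it matches Contents xs rather than Contents [xs] so that bags with more than one entry are handled. Both changes go beyond what the answer itself states.

```haskell
import Data.Maybe (fromMaybe)

data Fruit = Apple | Banana | Pear deriving (Eq, Show)
data Bag a = EmptyBag | Contents [(a, Integer)]

-- Eq a is needed because lookup compares keys with (==).
count :: Eq a => Bag a -> a -> Integer
count EmptyBag      _   = 0
count (Contents xs) obj = fromMaybe 0 (lookup obj xs)

-- count (Contents [(Apple,1),(Banana,2)]) Banana == 2
-- count (Contents [(Apple,1)]) Pear              == 0
```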
