data type with a default field and that needs a function that works with it - haskell

Say, I have a data type
data FooBar a = Foo String Char [a]
| Bar String Int [a]
I need to create values of this type and give empty list as the second field:
Foo "hello" 'a' []
or
Bar "world" 1 []
1) I do this everywhere in my code and I think it would be nice if I could omit the empty list part somehow and have the empty list assigned implicitly. Is this possible? Something similar to default function arguments in other languages.
2) Because of this [] "default" value, I often need to have a partial constructor application that results in a function that takes the first two values:
mkFoo x y = Foo x y []
mkBar x y = Bar x y []
Is there a "better" (more idiomatic, etc) way to do it? to avoid defining new functions?
3) I need a way to add things to the list:
add (Foo u v xs) x = Foo u v (x:xs)
add (Bar u v xs) x = Bar u v (x:xs)
Is this how it is done idiomatically? Just a general purpose function?
As you see I am a beginner, so maybe these questions make little sense. Hope not.

I'll address your questions one by one.
Default arguments do not exist in Haskell. They are simply not worth the added complexity and loss of compositionally. Being a functional language, you do a lot more function manipulation in Haskell, so funkiness like default arguments would be tough to handle.
One thing I didn't realize when I started Haskell is that data constructors are functions just like everything else. In your example,
Foo :: String -> Char -> [a] -> FooBar a
Thus you can write functions for filling in various arguments of other functions, and then those functions will work with Foo or Bar or whatever.
fill1 :: a -> (a -> b) -> b
fill1 a f = f a
--Note that fill1 = flip ($)
fill2 :: b -> (a -> b -> c) -> (a -> c)
--Equivalently, fill2 :: b -> (a -> b -> c) -> a -> c
fill2 b f = \a -> f a b
fill3 :: c -> (a -> b -> c -> d) -> (a -> b -> d)
fill3 c f = \a b -> f a b c
fill3Empty :: (a -> b -> [c] -> d) -> (a -> b -> d)
fill3Empty f = fill3 [] f
--Now, we can write
> fill3Empty Foo x y
Foo x y []
The lens package provides elegant solutions to questions like this. However, you can tell at a glance that this package is enormously complicated. Here is the net result of how you would call the lens package:
_list :: Lens (FooBar a) (FooBar b) [a] [b]
_list = lens getter setter
where getter (Foo _ _ as) = as
getter (Bar _ _ as) = as
setter (Foo s c _) bs = Foo s c bs
setter (Bar s i _) bs = Bar s i bs
Now we can do
> over _list (3:) (Foo "ab" 'c' [2,1])
Foo "ab" 'c' [3,2,1]
Some explanation: the lens function produces a Lens type when given a getter and a setter for some type. Lens s t a b is a type that says "s holds an a and t holds a b. Thus, if you give me a function a -> b, I can give you a function s -> t". That is exactly what over does: you provide it a lens and a function (in our case, (3:) was a function that adds 3 to the front of a List) and it applies the function "where the lens indicates". This is very similar to a functor, however, we have significantly more freedom (in this example, the functor instance would be obligated to change every element of the lists, not operate on the lists themselves).
Note that our new _list lens is very generic: it works equally well over Foo and Bar and the lens package provides many functions other than over for doing magical things.

The idiomatic thing is to take those parameters of a function or constructor that you commonly want to partially apply, and move them toward the beginning:
data FooBar a = Foo [a] String Char
| Bar [a] String Int
foo :: String -> Char -> FooBar a
foo = Foo []
bar :: String -> Int -> FooBar a
bar = Bar []
Similarly, reordering the parameters to add lets you partially apply add to get functions of type FooBar a -> FooBar a, which can be easily composed:
add :: a -> FooBar a -> FooBar a
add x (Foo xs u v) = Foo (x:xs) u v
add123 :: FooBar Int -> FooBar Int
add123 = add 1 . add 2 . add 3
add123 (foo "bar" 42) == Foo [1, 2, 3] "bar" 42

(2) and (3) are perfectly normal and idiomatic ways of doing such things. About (2) in particular, one expression you will occasionally hear is "smart constructor". That just means a function like your mkFoo/mkBar that produces a FooBar a (or a Maybe (FooBar a) etc.) with some extra logic to ensure only reasonable values can be constructed.
Here are some additional tricks that might (or might not!) make sense, depending on what you are trying to do with FooBar.
If you use Foo values and Barvalues in similar ways most of the time (i.e. the difference between having the Char field and the Int one is a minor detail), it makes sense to factor out the similarities and use a single constructor:
data FooBar a = FooBar String FooBarTag [a]
data FooBarTag = Foo Char | Bar Int
Beyond avoiding case analysis when you don't care about the FooBarTag, that allows you to safely use record syntax (records and types with multiple constructors do not mix well).
data FooBar a = FooBar
{ fooBarName :: String
, fooBarTag :: FooBarTag
, fooBarList :: [a]
}
Records allow you to use the fields without having to pattern match the whole thing.
If there are sensible defaults for all fields in a FooBar, you can go one step beyond mkFoo-like constructors and define a default value.
defaultFooBar :: FooBar a
defaultFooBar = FooBar
{ fooBarName = ""
, fooBarTag = Bar 0
, fooBarList = []
}
You don't need records to use a default, but they allow overriding default fields conveniently.
myFooBar = defaultFooBar
{ fooBarTag = Foo 'x'
}
If you ever get tired of typing long names for the defaults over and over, consider the data-default package:
instance Default (FooBar a) where
def = defaultFooBar
myFooBar = def { fooBarTag = Foo 'x' }
Do note that a significant number of people do not like the Default class, and not without reason. Still, for types which are very specific to your application (e.g. configuration settings) Default is perfectly fine IMO.
Finally, updating record fields can be messy. If you end up annoyed by that, you will find lens very useful. Note that it is a big library, and it might be a little overwhelming to a beginner, so take a deep breath beforehand. Here is a small sample:
{-# LANGUAGE TemplateHaskell #-} -- At the top of the file. Needed for makeLenses.
import Control.Lens
-- Note the underscores.
-- If you are going to use lenses, it is sensible not to export the field names.
data FooBar a = FooBar
{ _fooBarName :: String
, _fooBarTag :: FooBarTag
, _fooBarList :: [a]
}
makeLenses ''FooBar -- Defines lenses for the fields automatically.
defaultFooBar :: FooBar a
defaultFooBar = FooBar
{ _fooBarName = ""
, _fooBarTag = Bar 0
, _fooBarList = []
}
-- Using a lens (fooBarTag) to set a field without record syntax.
-- Note the lack of underscores in the name of the lens.
myFooBar = set fooBarTag (Foo 'x') defaultFooBar
-- Using a lens to access a field.
myTag = view fooBarTag myFooBar -- Results in Foo 'x'
-- Using a lens (fooBarList) to modify a field.
add :: a -> FooBar a -> FooBar a
add x fb = over fooBarList (x :) fb
-- set, view and over have operator equivalents, (.~). (^.) and (%~) respectively.
-- Note that (^.) is flipped with respect to view.
Here is a gentle introduction to lens which focuses on aspects I have not demonstrated here, specially in how nicely lenses can be composed.

Related

Point Free Style Required for Optimized Curry

Say we have a (contrived) function like so:
import Data.List (sort)
contrived :: Ord a => [a] -> [a] -> [a]
contrived a b = (sort a) ++ b
And we partially apply it to use elsewhere, eg:
map (contrived [3,2,1]) [[4],[5],[6]]
On the surface, this works as one would expect:
[[1,2,3,4],[1,2,3,5],[1,2,3,6]]
However, if we throw some traces in:
import Debug.Trace (trace)
contrived :: Ord a => [a] -> [a] -> [a]
contrived a b = (trace "sorted" $ sort a) ++ b
map (contrived $ trace "a value" [3,2,1]) [[4],[5],[6]]
We see that the first list passed into contrived is evaluated only once, but it is sorted for each item in [4,5,6]:
[sorted
a value
[1,2,3,4],sorted
[1,2,3,5],sorted
[1,2,3,6]]
Now, contrived can be rather simply translated to point-free style:
contrived :: Ord a => [a] -> [a] -> [a]
contrived a = (++) (sort a)
Which when partially applied:
map (contrived [3,2,1]) [4,5,6]
Still works as we expect:
[[1,2,3,4],[1,2,3,5],[1,2,3,6]]
But if we again add traces:
contrived :: Ord a => [a] -> [a] -> [a]
contrived a = (++) (trace "sorted" $ sort a)
map (contrived $ trace "a value" [3,2,1]) [[4],[5],[6]]
We see that now the first list passed into contrived is evaluated and sorted only once:
[sorted
a value
[1,2,3,4],[1,2,3,5],[1,2,3,6]]
Why is this so? Since the translation into pointfree style is so trivial, why can't GHC deduce that it only needs to sort a once in the first version of contrived?
Note: I know that for this rather trivial example, it's probably preferable to use pointfree style. This is a contrived example that I've simplified quite a bit. The real function that I'm having the issue with is less clear (in my opinion) when expressed in pointfree style:
realFunction a b = conditionOne && conditionTwo
where conditionOne = map (something a) b
conditionTwo = somethingElse a b
In pointfree style, this requires writing an ugly wrapper (both) around (&&):
realFunction a = both conditionOne conditionTwo
where conditionOne = map (something a)
conditionTwo = somethingElse a
both f g x = (f x) && (g x)
As an aside, I'm also not sure why the both wrapper works; the pointfree style of realFunction behaves like the pointfree style version of contrived in that the partial application is only evaluated once (ie. if something sorted a it would only do so once). It appears that since both is not pointfree, Haskell should have the same issue that it had with the non-pointfree contrived.
If I understand correctly, you are looking for this:
contrived :: Ord a => [a] -> [a] -> [a]
contrived a = let a' = sort a in \b -> a' ++ b
-- or ... in (a' ++)
If you want the sort to be computed only once, it has to be done before the \b.
You are correct in that a compiler could optimize this. This is known as the "full laziness" optimization.
If I remember correctly, GHC does not always do it because it's not always an actual optimization, in the general case. Consider the contrived example
foo :: Int -> Int -> Int
foo x y = let a = [1..x] in length a + y
When passing both arguments, the above code works in constant space: the list elements are immediately garbage collected as they are produced.
When partially applying x, the closure for foo x only requires O(1) memory, since the list is not yet generated. Code like
let f = foo 1000 in f 10 + f 20 -- (*)
still run in constant space.
Instead, if we wrote
foo :: Int -> Int -> Int
foo x = let a = [1..x] in (length a +)
then (*) would no longer run in constant space. The first call f 10 would allocate a 1000-long list, and keep it in memory for the second call f 20.
Note that your partial application
... = (++) (sort a)
essentially means
... = let a' = sort a in \b -> a' ++ b
since argument passing involves a binding, as in let. So, the result of your sort a is kept around for all the future calls.

Combining patterns

Consider the following data type and pattern synonyms:
{-# LANGUAGE PatternSynonyms, NamedFieldPuns #-}
data Foo = Foo {
a :: Int
, b :: String
, c :: Maybe Bool
}
pattern Bar a b <- Foo { a, b }
pattern Baz c <- Foo { c }
I'd like to match a Foo but get all of a, b, and c. Something like this (invalid Haskell):
showit :: Foo -> String
showit (Bar a b & Baz c) = show a ++ b ++ show c
One option is to use a ViewPattern:
dup :: a -> (a, a)
dup a = (a, a)
showall' :: Foo -> String
showall' (dup -> (Bar a b, Baz c)) = show a ++ b ++ show c
But this results in a non-exhaustive match warning. But we know that Bar and Baz are irrefutable so matching each is also irrefutable.
How can this be expressed without a compiler warning?
The motivation for this is to have fine-grained pattern synonyms against fields of a large data type and allow callers to extract just the desired fields similar to records with NamedFieldPuns. Pattern synonyms don't yet support record syntax, but it's in the works : https://ghc.haskell.org/trac/ghc/ticket/8582
In my case I cannot expose the constructor from the module since I'm using the "smart-constructor" pattern, and hence cannot give callers the benefit of record pattern matching with NamedFieldPuns.
See https://stackoverflow.com/a/25161749/3198498 as the inspiration. I'm trying to expand the idea in that answer to allow callers to arbitrarily extract n of m fields, for a largish m.
Edit: It turns out there is a rather broad problem with PatternSynonyms and exhaustiveness checks : https://ghc.haskell.org/trac/ghc/ticket/10339
This seems to make pattern synonyms used as field extractors very unpleasant when compiling with warnings enabled.
Not sure if this is at all helpful, but I will give it a shot. Would either of these solutions be acceptable?
showit :: Foo -> String
showit x#(Bar a b) = show a ++ b ++ show (c x)
showit' :: Foo -> String
showit' x#(Bar a b) = show a ++ b ++ showthat x
where
showthat (Baz c) = show c

Omitting constructor arguments in Haskell case statements

Omitting function arguments is a nice tool for concise Haskell code.
h :: String -> Int
h = (4 +) . length
What about omitting data constructor arguments in case statements. The following code might be considered a little grungy, where s and i are the final arguments in A and B but are repeated as the final arguments in the body of each case match.
f :: Foo -> Int
f = \case
A s -> 4 + length s
B i -> 2 + id i
Is there a way to omit such arguments in case pattern matching? For constructors with a large number of arguments, this would radically shorten code width. E.g. the following pseudo code.
g :: Foo -> Int
g = \case
{- match `A` constructor -> function application to A's arguments -}
A -> (4 +) . length
{- match `B` constructor -> function application to B's arguments -}
B -> (2 +) . id
The GHC extension RecordWildCards lets you concisely bring all the fields of a constructor into scope (of course, this requires you to give names to those fields).
{-# LANGUAGE LambdaCase, RecordWildCards #-}
data Foo = Foo {field1, field2 :: Int} | Bar {field1 :: Int}
baz = \case
Foo{..} -> 4 + field2
Bar{..} -> 2 + field1
-- plus it also "sucks in" fields from a scope
mkBar400 = let field1 = 400 in Bar{..}
`
You can always refactor case statements on constructors into a single function so that from then on you only pass your concise function definitions as arguments to these specific functions. Allow me to illustrate.
Consider the Maybe a datatype:
data Maybe a = Nothing | Just a
Should you now need to define a function f :: Maybe a -> b (for some fixed b and perhaps also a), instead of writing it like
f Nothing = this
f (Just x) = that x
you could start by first defining a function
maybe f _ Nothing = f
maybe _ g (Just x) = g x
and then f can by defined as maybe this that. Pretty much as what happens with all the familiar recursion patterns.
This way you're effectively refactoring out case statements. The code gets arguably cleaner and it does not require language extensions.

Defining a suite of tests for a class

This question picks up where Haskell QuickCheck best practices (especially when testing type classes) left off.
I have a class and a bunch of implementations of that class. Something like this:
import Test.QuickCheck
import Control.Applicative
import Test.Framework
import Test.Framework.Providers.QuickCheck2
class C c where
f :: c -> Int
data A = A Int deriving Show
instance C A where
f (A a) = 2*a
data B = B Int deriving Show
instance C B where
f (B b) = 2*b
All of my implementations should satisfy a certain property. For example:
prop_f_is_even :: C c => c -> Property
prop_f_is_even x = property $ even (f x)
I want to test that property for each of the implementations. I can do something like this. (I'm using Test.Framework.)
instance Arbitrary A where
arbitrary = A <$> arbitrary
instance Arbitrary B where
arbitrary = B <$> arbitrary
test :: Test
test = testGroup "Whole buncha tests"
[
testProperty "prop_f_is_even - A" (prop_f_is_even :: A -> Property),
testProperty "prop_f_is_even - B" (prop_f_is_even :: B -> Property)
-- continue on for all combinations of properties and implementations
]
But in my case, I have dozens of properties to test, and a dozen or so
classes, so that approach is error-prone, and a hassle.
(A common mistake I make is to cut-and paste tests, but forget to change
the type name, so I end up testing A twice for that property, without testing B.)
I have a solution, which I'll post below in case anyone else finds it helpful.
This is my solution.
cProperties :: C t => String -> [(String, t -> Property)]
cProperties s =
[
("prop_f_is_even: " ++ s, prop_f_is_even)
-- plus any other tests that instances of C should satisfy
]
makeTests :: (Arbitrary t, Show t) => [(String, t -> Property)] -> [Test]
makeTests ts = map (\(s,t) -> testProperty s t) ts
aProperties :: [(String, A -> Property)]
aProperties = cProperties "A"
bProperties :: [(String, B -> Property)]
bProperties = cProperties "B"
easierTest :: Test
easierTest =
testGroup "tests" (makeTests aProperties ++ makeTests bProperties)
With this approach, if I want to add another property that all instances of C should satisfy, I just add it to cProperties. And if I create another instance of C, call it D, then I define dProperties similarly to aProperties and bProperties, and then update easierTest.
EDIT:
One disadvantage of this approach is that all tests in cProperties have to have the type signature t -> Property. I myself have not found this to be a hindrance because in cases where I apply this technique, I have already -- for unrelated reasons -- defined a type that encompasses all of the data for a test.
Another disadvantage is that, in ghci, I can no longer type, for example:
quickCheck prop_f_is_even
Now I have to type something like this:
quickCheck (prop_f_is_even :: A -> Property)

Sort by constructor ignoring (part of) value

Suppose I have
data Foo = A String Int | B Int
I want to take an xs :: [Foo] and sort it such that all the As are at the beginning, sorted by their strings, but with the ints in the order they appeared in the list, and then have all the Bs at the end, in the same order they appeared.
In particular, I want to create a new list containg the first A of each string and the first B.
I did this by defining a function taking Foos to (Int, String)s and using sortBy and groupBy.
Is there a cleaner way to do this? Preferably one that generalizes to at least 10 constructors.
Typeable, maybe? Something else that's nicer?
EDIT: This is used for processing a list of Foos that is used elsewhere. There is already an Ord instance which is the normal ordering.
You can use
sortBy (comparing foo)
where foo is a function that extracts the interesting parts into something comparable (e.g. Ints).
In the example, since you want the As sorted by their Strings, a mapping to Int with the desired properties would be too complicated, so we use a compound target type.
foo (A s _) = (0,s)
foo (B _) = (1,"")
would be a possible helper. This is more or less equivalent to Tikhon Jelvis' suggestion, but it leaves space for the natural Ord instance.
To make it easier to build comparison function for ADTs with large number of constructors, you can map values to their constructor index with SYB:
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Generics
data Foo = A String Int | B Int deriving (Show, Eq, Typeable, Data)
cIndex :: Data a => a -> Int
cIndex = constrIndex . toConstr
Example:
*Main Data.Generics> cIndex $ A "foo" 42
1
*Main Data.Generics> cIndex $ B 0
2
Edit:After re-reading your question, I think the best option is to make Foo an instance of Ord. I do not think there is any way to do this automatically that will act the way you want (just using deriving will create different behavior).
Once Foo is an instance of Ord, you can just use sort from Data.List.
In your exact example, you can do something like this:
data Foo = A String Int | B Int deriving (Eq)
instance Ord Foo where
(A _ _) <= (B _) = True
(A s _) <= (A s' _) = s <= s'
(B _) <= (B _) = True
When something is an instance of Ord, it means the data type has some ordering. Once we know how to order something, we can use a bunch of existing functions (like sort) on it and it will behave how you want. Anything in Ord has to be part of Eq, which is what the deriving (Eq) bit does automatically.
You can also derive Ord. However, the behavior will not be exactly what you want--it will order by all of the fields if it has to (e.g. it will put As with the same string in order by their integers).
Further edit: I was thinking about it some more and realized my solution is probably semantically wrong.
An Ord instance is a statement about your whole data type. For example, I'm saying that Bs are always equal with each other when the derived Eq instance says otherwise.
If the data your representing always behaves like this (that is, Bs are all equal and As with the same string are all equal) then an Ord instance makes sense. Otherwise, you should not actually do this.
However, you can do something almost exactly like this: write your own special compare function (Foo -> Foo -> Ordering) that encapsulates exactly what you want to do then use sortBy. This properly codifies that your particular sorting is special rather than the natural ordering of the data type.
You could use some template haskell to fill in the missing transitive cases. The mkTransitiveLt creates the transitive closure of the given cases (if you order them least to greatest). This gives you a working less-than, which can be turned into a function that returns an Ordering.
{-# LANGUAGE TemplateHaskell #-}
import MkTransitiveLt
import Data.List (sortBy)
data Foo = A String Int | B Int | C | D | E deriving(Show)
cmp a b = $(mkTransitiveLt [|
case (a, b) of
(A _ _, B _) -> True
(B _, C) -> True
(C, D) -> True
(D, E) -> True
(A s _, A s' _) -> s < s'
otherwise -> False|])
lt2Ord f a b =
case (f a b, f b a) of
(True, _) -> LT
(_, True) -> GT
otherwise -> EQ
main = print $ sortBy (lt2Ord cmp) [A "Z" 1, A "A" 1, B 1, A "A" 0, C]
Generates:
[A "A" 1,A "A" 0,A "Z" 1,B 1,C]
mkTransitiveLt must be defined in a separate module:
module MkTransitiveLt (mkTransitiveLt)
where
import Language.Haskell.TH
mkTransitiveLt :: ExpQ -> ExpQ
mkTransitiveLt eq = do
CaseE e ms <- eq
return . CaseE e . reverse . foldl go [] $ ms
where
go ms m#(Match (TupP [a, b]) body decls) = (m:ms) ++
[Match (TupP [x, b]) body decls | Match (TupP [x, y]) _ _ <- ms, y == a]
go ms m = m:ms

Resources