How to avoid repetition when "factoring" a Haskell function into two functions with various intermediary types?

How to avoid repetition when "factoring" a Haskell function into two functions with various intermediary types? - haskell

Consider the following Haskell code:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE TemplateHaskell #-}
import Data.Singletons.TH
singletons [d| data SimpleType = Aaa | Bbb | Ccc | Ddd
deriving (Read)
|]
-- each SimpleType value has an associated type
type family Parsed (t :: SimpleType) :: * where
Parsed Aaa = [Int]
Parsed Bbb = Maybe Int
Parsed Ccc = (Int, Int)
Parsed Ddd = Int
forth :: SSimpleType t -> Int -> Parsed t
forth SAaa x = [x,x*2,x*3]
forth SBbb x = Just x
forth SCcc x = (1337, x)
forth SDdd x = x
back :: SSimpleType t -> Parsed t -> Int
back SAaa [_, y, _] = y + 5
back SBbb (Just y) = y - 7
back SCcc (y1, y2) = y1 + y2
back SDdd y = y * 2
helper b = back b . forth b
go :: SimpleType -> Int -> Int
go Aaa = helper SAaa
go Bbb = helper SBbb
go Ccc = helper SCcc
go Ddd = helper SDdd
main = do
-- SimpleType value comes at run-time
val <- readLn
putStrLn $ show $ go val 100
Is it possible to avoid the repetition when go is defined? In other words, is there a way to write something like:
go val = helper (someMagicFunction val)
singletons do not have to be a part of the solution,
... but the idea that go is factored into forth and back with intermediary types dependent on Simple should be preserved.

You can use toSing from SingKind to convert a value of SimpelType to a value of SomeSing SimpleType, which is an existentially quantified wrapper around Sing SimpleType. You can then unwrap that value to get Sing SimpleType, which you can then pass to back and forth:
go :: SimpleType -> Int -> Int
go val x =
case toSing val of
SomeSing s -> back s $ forth s x
An instance of SingKind is generated for you (among many other things) by the singletons splice that you're using.
Note that, while a single-branch case is asking to be a let, this wouldn't compile:
go val x =
let (SomeSing s) = toSing val
in back s $ forth s x
This is prohibited, because let can be recursive, and since unwrapping a GADT may bring new types into context, it may result in creating an infinite type. A case branch, on the other hand, cannot be recursive, so this works. (credit for this explanation to #HTNW)
But a helper function would also work:
go val x = helper $ toSing val
where
helper (SomeSing s) = back s $ forth s x

Related

Pattern matching on a private data constructor

I'm writing a simple ADT for grid axis. In my application grid may be either regular (with constant step between coordinates), or irregular (otherwise). Of course, the regular grid is just a special case of irregular one, but it may worth to differentiate between them in some situations (for example, to perform some optimizations). So, I declare my ADT as the following:
data GridAxis = RegularAxis (Float, Float) Float -- (min, max) delta
| IrregularAxis [Float] -- [xs]
But I don't want user to create malformed axes with max < min or with unordered xs list. So, I add "smarter" construction functions which perform some basic checks:
regularAxis :: (Float, Float) -> Float -> GridAxis
regularAxis (a, b) dx = RegularAxis (min a b, max a b) (abs dx)
irregularAxis :: [Float] -> GridAxis
irregularAxis xs = IrregularAxis (sort xs)
I don't want user to create grids directly, so I don't add GridAxis data constructors into module export list:
module GridAxis (
GridAxis,
regularAxis,
irregularAxis,
) where
But it turned out that after having this done I cannot use pattern matching on GridAxis anymore. Trying to use it
import qualified GridAxis as GA
test :: GA.GridAxis -> Bool
test axis = case axis of
GA.RegularAxis -> True
GA.IrregularAxis -> False
gives the following compiler error:
src/Physics/ImplicitEMC.hs:7:15:
Not in scope: data constructor `GA.RegularAxis'
src/Physics/ImplicitEMC.hs:8:15:
Not in scope: data constructor `GA.IrregularAxis'
Is there something to work this around?

You can define constructor pattern synonyms. This lets you use the same name for smart construction and "dumb" pattern matching.
{-# LANGUAGE PatternSynonyms #-}
module GridAxis (GridAxis, pattern RegularAxis, pattern IrregularAxis) where
import Data.List
data GridAxis = RegularAxis_ (Float, Float) Float -- (min, max) delta
| IrregularAxis_ [Float] -- [xs]
-- The line with "<-" defines the matching behavior
-- The line with "=" defines the constructor behavior
pattern RegularAxis minmax delta <- RegularAxis_ minmax delta where
RegularAxis (a, b) dx = RegularAxis_ (min a b, max a b) (abs dx)
pattern IrregularAxis xs <- IrregularAxis_ xs where
IrregularAxis xs = IrregularAxis_ (sort xs)
Now you can do:
module Foo
import GridAxis
foo :: GridAxis -> a
foo (RegularAxis (a, b) d) = ...
foo (IrregularAxis xs) = ...
And also use RegularAxis and IrregularAxis as smart constructors.

This looks as a use case for pattern synonyms.
Basically you don't export the real constructor, but only a "smart" one
{-# LANGUAGE PatternSynonyms #-}
module M(T(), SmartCons, smartCons) where
data T = RealCons Int
-- the users will construct T using this
smartCons :: Int -> T
smartCons n = if even n then RealCons n else error "wrong!"
-- ... and destruct T using this
pattern SmartCons n <- RealCons n
Another module importing M can then use
case someTvalue of
SmartCons n -> use n
and e.g.
let value = smartCons 23 in ...
but can not use the RealCons directly.
If you prefer to stay in basic Haskell, without extensions, you can use a "view type"
module M(T(), smartCons, Tview(..), toView) where
data T = RealCons Int
-- the users will construct T using this
smartCons :: Int -> T
smartCons n = if even n then RealCons n else error "wrong!"
-- ... and destruct T using this
data Tview = Tview Int
toView :: T -> Tview
toView (RealCons n) = Tview n
Here, users have full access to the view type, which can be constructed/destructed freely, but have only a restricted start constructor for the actual type T. Destructing the actual type T is possible by moving to the view type
case toView someTvalue of
Tview n -> use n
For nested patterns, things become more cumbersome, unless you enable other extensions such as ViewPatterns.

Test if a value matches a constructor

Say I have a data type like so:
data NumCol = Empty |
Single Int |
Pair Int Int |
Lots [Int]
Now I wish to filter out the elements matching a given constructor from a [NumCol]. I can write it for, say, Pair:
get_pairs :: [NumCol] -> [NumCol]
get_pairs = filter is_pair
where is_pair (Pair _ _) = True
is_pair _ = False
This works, but it's not generic. I have to write a separate function for is_single, is_lots, etc.
I wish instead I could write:
get_pairs = filter (== Pair)
But this only works for type constructors that take no arguments (i.e. Empty).
So the question is, how can I write a function that takes a value and a constructor, and returns whether the value matches the constructor?

At least get_pairs itself can be defined relatively simply by using a list comprehension to filter instead:
get_pairs xs = [x | x#Pair {} <- xs]
For a more general solution of matching constructors, you can use prisms from the lens package:
{-# LANGUAGE TemplateHaskell #-}
import Control.Lens
import Control.Lens.Extras (is)
data NumCol = Empty |
Single Int |
Pair Int Int |
Lots [Int]
-- Uses Template Haskell to create the Prisms _Empty, _Single, _Pair and _Lots
-- corresponding to your constructors
makePrisms ''NumCol
get_pairs :: [NumCol] -> [NumCol]
get_pairs = filter (is _Pair)

Tags of tagged unions ought to be first-class values, and with a wee bit of effort, they are.
Jiggery-pokery alert:
{-# LANGUAGE GADTs, DataKinds, KindSignatures,
TypeFamilies, PolyKinds, FlexibleInstances,
PatternSynonyms
#-}
Step one: define type-level versions of the tags.
data TagType = EmptyTag | SingleTag | PairTag | LotsTag
Step two: define value-level witnesses for the representability of the type-level tags. Richard Eisenberg's Singletons library will do this for you. I mean something like this:
data Tag :: TagType -> * where
EmptyT :: Tag EmptyTag
SingleT :: Tag SingleTag
PairT :: Tag PairTag
LotsT :: Tag LotsTag
And now we can say what stuff we expect to find associated with a given tag.
type family Stuff (t :: TagType) :: * where
Stuff EmptyTag = ()
Stuff SingleTag = Int
Stuff PairTag = (Int, Int)
Stuff LotsTag = [Int]
So we can refactor the type you first thought of
data NumCol :: * where
(:&) :: Tag t -> Stuff t -> NumCol
and use PatternSynonyms to recover the behaviour you had in mind:
pattern Empty = EmptyT :& ()
pattern Single i = SingleT :& i
pattern Pair i j = PairT :& (i, j)
pattern Lots is = LotsT :& is
So what's happened is that each constructor for NumCol has turned into a tag indexed by the kind of tag it's for. That is, constructor tags now live separately from the rest of the data, synchronized by a common index which ensures that the stuff associated with a tag matches the tag itself.
But we can talk about tags alone.
data Ex :: (k -> *) -> * where -- wish I could say newtype here
Witness :: p x -> Ex p
Now, Ex Tag, is the type of "runtime tags with a type level counterpart". It has an Eq instance
instance Eq (Ex Tag) where
Witness EmptyT == Witness EmptyT = True
Witness SingleT == Witness SingleT = True
Witness PairT == Witness PairT = True
Witness LotsT == Witness LotsT = True
_ == _ = False
Moreover, we can easily extract the tag of a NumCol.
numColTag :: NumCol -> Ex Tag
numColTag (n :& _) = Witness n
And that allows us to match your specification.
filter ((Witness PairT ==) . numColTag) :: [NumCol] -> [NumCol]
Which raises the question of whether your specification is actually what you need. The point is that detecting a tag entitles you an expectation of that tag's stuff. The output type [NumCol] doesn't do justice to the fact that you know you have just the pairs.
How might you tighten the type of your function and still deliver it?

One approach is to use DataTypeable and the Data.Data module. This approach relies on two autogenerated typeclass instances that carry metadata about the type for you: Typeable and Data. You can derive them with {-# LANGUAGE DeriveDataTypeable #-}:
data NumCol = Empty |
Single Int |
Pair Int Int |
Lots [Int] deriving (Typeable, Data)
Now we have a toConstr function which, given a value, gives us a representation of its constructor:
toConstr :: Data a => a -> Constr
This makes it easy to compare two terms just by their constructors. The only remaining problem is that we need a value to compare against when we define our predicate! We can always just create a dummy value with undefined, but that's a bit ugly:
is_pair x = toConstr x == toConstr (Pair undefined undefined)
So the final thing we'll do is define a handy little class that automates this. The basic idea is to call toConstr on non-function values and recurse on any functions by first passing in undefined.
class Constrable a where
constr :: a -> Constr
instance Data a => Constrable a where
constr = toConstr
instance Constrable a => Constrable (b -> a) where
constr f = constr (f undefined)
This relies on FlexibleInstance, OverlappingInstances and UndecidableInstances, so it might be a bit evil, but, using the (in)famous eyeball theorem, it should be fine. Unless you add more instances or try to use it with something that isn't a constructor. Then it might blow up. Violently. No promises.
Finally, with the evil neatly contained, we can write an "equal by constructor" operator:
(=|=) :: (Data a, Constrable b) => a -> b -> Bool
e =|= c = toConstr e == constr c
(The =|= operator is a bit of a mnemonic, because constructors are syntactically defined with a |.)
Now you can write almost exactly what you wanted!
filter (=|= Pair)
Also, maybe you'd want to turn off the monomorphism restriction. In fact, here's the list of extensions I enabled that you can just use:
{-# LANGUAGE DeriveDataTypeable, FlexibleInstances, NoMonomorphismRestriction, OverlappingInstances, UndecidableInstances #-}
Yeah, it's a lot. But that's what I'm willing to sacrifice for the cause. Of not writing extra undefineds.
Honestly, if you don't mind relying on lens (but boy is that dependency a doozy), you should just go with the prism approach. The only thing to recommend mine is that you get to use the amusingly named Data.Data.Data class.

Omitting constructor arguments in Haskell case statements

Omitting function arguments is a nice tool for concise Haskell code.
h :: String -> Int
h = (4 +) . length
What about omitting data constructor arguments in case statements. The following code might be considered a little grungy, where s and i are the final arguments in A and B but are repeated as the final arguments in the body of each case match.
f :: Foo -> Int
f = \case
A s -> 4 + length s
B i -> 2 + id i
Is there a way to omit such arguments in case pattern matching? For constructors with a large number of arguments, this would radically shorten code width. E.g. the following pseudo code.
g :: Foo -> Int
g = \case
{- match `A` constructor -> function application to A's arguments -}
A -> (4 +) . length
{- match `B` constructor -> function application to B's arguments -}
B -> (2 +) . id

The GHC extension RecordWildCards lets you concisely bring all the fields of a constructor into scope (of course, this requires you to give names to those fields).
{-# LANGUAGE LambdaCase, RecordWildCards #-}
data Foo = Foo {field1, field2 :: Int} | Bar {field1 :: Int}
baz = \case
Foo{..} -> 4 + field2
Bar{..} -> 2 + field1
-- plus it also "sucks in" fields from a scope
mkBar400 = let field1 = 400 in Bar{..}
`

You can always refactor case statements on constructors into a single function so that from then on you only pass your concise function definitions as arguments to these specific functions. Allow me to illustrate.
Consider the Maybe a datatype:
data Maybe a = Nothing | Just a
Should you now need to define a function f :: Maybe a -> b (for some fixed b and perhaps also a), instead of writing it like
f Nothing = this
f (Just x) = that x
you could start by first defining a function
maybe f _ Nothing = f
maybe _ g (Just x) = g x
and then f can by defined as maybe this that. Pretty much as what happens with all the familiar recursion patterns.
This way you're effectively refactoring out case statements. The code gets arguably cleaner and it does not require language extensions.

Case between two unrelated types in Haskell

Is it possible to use case expression between two unrelated types in Haskell, like in this example (not working) code:
data A = A
data B = B
f x = case x of
A -> 1
B -> 2
main = do
print $ test A
return ()
I know I can use Either here, but this code is not meant to be used - I want to deeply learn the Haskell type system and see what could be done.

A and B are distinct types. If you want a function that can take values of multiple types, you need a typeclass.
data A = A
data B = B
class F a where
f :: a -> Int
instance F A where
f _ = 1
instance F B where
f _ = 2
main = do
print $ f A

No, this is not possible with a normal case statement, but you can hack this sort of thing using type classes:
data A = A
data B = B
class Test a where
test :: a -> Int
instance Test A where test = const 1
instance Test B where test = const 2
main = print $ test A
But you should only use this if it's really required, as it gets messy very soon and you end up with needing a lots of extensions to the type system (UndecidableInstances, OverlappingInstances, ...)

Rather than writing your own type class for f, you could use the Typeable
class which has some support by ghc. You don't need to use Dynamic here, but in
some cases it is needed.
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Dynamic
data A = A deriving (Typeable)
data B = B deriving (Typeable)
f x | Just A <- fromDynamic x = 1
| Just B <- fromDynamic x = 2
f2 x | Just A <- cast x = 1
| Just B <- cast x = 2
main = do
print $ f (toDyn A)
print $ f2 A

Sort by constructor ignoring (part of) value

Suppose I have
data Foo = A String Int | B Int
I want to take an xs :: [Foo] and sort it such that all the As are at the beginning, sorted by their strings, but with the ints in the order they appeared in the list, and then have all the Bs at the end, in the same order they appeared.
In particular, I want to create a new list containg the first A of each string and the first B.
I did this by defining a function taking Foos to (Int, String)s and using sortBy and groupBy.
Is there a cleaner way to do this? Preferably one that generalizes to at least 10 constructors.
Typeable, maybe? Something else that's nicer?
EDIT: This is used for processing a list of Foos that is used elsewhere. There is already an Ord instance which is the normal ordering.

You can use
sortBy (comparing foo)
where foo is a function that extracts the interesting parts into something comparable (e.g. Ints).
In the example, since you want the As sorted by their Strings, a mapping to Int with the desired properties would be too complicated, so we use a compound target type.
foo (A s _) = (0,s)
foo (B _) = (1,"")
would be a possible helper. This is more or less equivalent to Tikhon Jelvis' suggestion, but it leaves space for the natural Ord instance.

To make it easier to build comparison function for ADTs with large number of constructors, you can map values to their constructor index with SYB:
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Generics
data Foo = A String Int | B Int deriving (Show, Eq, Typeable, Data)
cIndex :: Data a => a -> Int
cIndex = constrIndex . toConstr
Example:
*Main Data.Generics> cIndex $ A "foo" 42
1
*Main Data.Generics> cIndex $ B 0
2

Edit:After re-reading your question, I think the best option is to make Foo an instance of Ord. I do not think there is any way to do this automatically that will act the way you want (just using deriving will create different behavior).
Once Foo is an instance of Ord, you can just use sort from Data.List.
In your exact example, you can do something like this:
data Foo = A String Int | B Int deriving (Eq)
instance Ord Foo where
(A _ _) <= (B _) = True
(A s _) <= (A s' _) = s <= s'
(B _) <= (B _) = True
When something is an instance of Ord, it means the data type has some ordering. Once we know how to order something, we can use a bunch of existing functions (like sort) on it and it will behave how you want. Anything in Ord has to be part of Eq, which is what the deriving (Eq) bit does automatically.
You can also derive Ord. However, the behavior will not be exactly what you want--it will order by all of the fields if it has to (e.g. it will put As with the same string in order by their integers).
Further edit: I was thinking about it some more and realized my solution is probably semantically wrong.
An Ord instance is a statement about your whole data type. For example, I'm saying that Bs are always equal with each other when the derived Eq instance says otherwise.
If the data your representing always behaves like this (that is, Bs are all equal and As with the same string are all equal) then an Ord instance makes sense. Otherwise, you should not actually do this.
However, you can do something almost exactly like this: write your own special compare function (Foo -> Foo -> Ordering) that encapsulates exactly what you want to do then use sortBy. This properly codifies that your particular sorting is special rather than the natural ordering of the data type.

You could use some template haskell to fill in the missing transitive cases. The mkTransitiveLt creates the transitive closure of the given cases (if you order them least to greatest). This gives you a working less-than, which can be turned into a function that returns an Ordering.
{-# LANGUAGE TemplateHaskell #-}
import MkTransitiveLt
import Data.List (sortBy)
data Foo = A String Int | B Int | C | D | E deriving(Show)
cmp a b = $(mkTransitiveLt [|
case (a, b) of
(A _ _, B _) -> True
(B _, C) -> True
(C, D) -> True
(D, E) -> True
(A s _, A s' _) -> s < s'
otherwise -> False|])
lt2Ord f a b =
case (f a b, f b a) of
(True, _) -> LT
(_, True) -> GT
otherwise -> EQ
main = print $ sortBy (lt2Ord cmp) [A "Z" 1, A "A" 1, B 1, A "A" 0, C]
Generates:
[A "A" 1,A "A" 0,A "Z" 1,B 1,C]
mkTransitiveLt must be defined in a separate module:
module MkTransitiveLt (mkTransitiveLt)
where
import Language.Haskell.TH
mkTransitiveLt :: ExpQ -> ExpQ
mkTransitiveLt eq = do
CaseE e ms <- eq
return . CaseE e . reverse . foldl go [] $ ms
where
go ms m#(Match (TupP [a, b]) body decls) = (m:ms) ++
[Match (TupP [x, b]) body decls | Match (TupP [x, y]) _ _ <- ms, y == a]
go ms m = m:ms

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

How to avoid repetition when "factoring" a Haskell function into two functions with various intermediary types? - haskell

Related

Pattern matching on a private data constructor

Test if a value matches a constructor

Omitting constructor arguments in Haskell case statements

Case between two unrelated types in Haskell

Sort by constructor ignoring (part of) value

Categories

Resources