GADTs or phantom types to type-check function calls but keep homogeneity of type - haskell

I assume the following problem can be solved using type arithmetic but haven't found the solution yet.
Problem
I have a finite map from strings to values (using Tries as implementation) that I parse from a binary/text file (json, xml, ...).
type Value = ...
type Attributes = Data.Trie Value
data Object = Object Attributes
Each map has the same type of values but not the same set of keys.
I group maps with the same set of keys together to be able to prevent having to type-switch all the time I have a specialised function that requires certain keys:
data T1
data T2
...
data Object a where
T1 :: Attributes -> Object T1
T2 :: Attributes -> Object T2
...
This allows me to write something like:
f1 :: Object T1 -> ...
instead of
f1 :: Object ->
f1 o | check_if_T1 o = ...
This works but has two disadvantages:
Homogeneous lists of Object now become heterogeneous, i.e. I cannot have a list [Object] anymore.
I need to write a lot of boilerplate to get/set attributes:
get :: Object a -> Attributes
get (T1 a) = a
get (T2 a) = a
...
Question
Is there a better way to specialise functions depending on the constructor of an ADT?
How could I regain the ability to have a list [Object]? Is there a specialized version of Dynamic that only allows certain types?
I thought about wrapping the Object again, but this would add a lot of boilerplate. For instance,
data TObject = TT1 T1 | TT2 T2 ...
What I need is:
get :: a -> TObject -> Object a
So that I can then derive:
collect :: a -> [TObject] -> [Object a]
I looked into HList but I don't think it fits my problem. Especially, since the order of types in [Object] is not known at compile time.
It sounds to me like this can be solved using functional dependency / type arithmetic but I simply haven't found a nice way yet.

If all the constructors return a monomorphic Object type and there's no recursion, you might want to think about just using separate types. Instead of
data T1
data T2
data Object a where
T1 :: Attributes -> Object T1
T2 :: Attributes -> Object T2
consider
data T1 = T1 Attributes
data T2 = T2 Attributes
Dynamic is one way, and using the above you could just add deriving Typeable and be done. Alternately, you can do it by hand:
data TSomething = It's1 T1 | It's2 T2
getT1s :: [TSomething] -> [T1]
getT2s :: [TSomething] -> [T2]
getT1s xs = [t1 | It's1 t1 <- xs]
getT2s xs = [t2 | It's2 t2 <- xs]
As you say, this involves a bit of boilerplate. The Typeable version looks a bit nicer:
deriving Typeable T1
deriving Typeable T2
-- can specialize at the call-site to
-- getTs :: [Dynamic] -> [T1] or
-- getTs :: [Dynamic] -> [T2]
getTs :: Typeable a => [Dynamic] -> [a]
getTs xs = [x | Just x <- map fromDynamic xs]

Related

Deriving Eq and Show for an ADT that contains fields that can't have Eq or Show

I'd like to be able to derive Eq and Show for an ADT that contains multiple fields. One of them is a function field. When doing Show, I'd like it to display something bogus, like e.g. "<function>"; when doing Eq, I'd like it to ignore that field. How can I best do this without hand-writing a full instance for Show and Eq?
I don't want to wrap the function field inside a newtype and write my own Eq and Show for that - it would be too bothersome to use like that.
One way you can get proper Eq and Show instances is to, instead of hard-coding that function field, make it a type parameter and provide a function that just “erases” that field. I.e., if you have
data Foo = Foo
{ fooI :: Int
, fooF :: Int -> Int }
you change it to
data Foo' f = Foo
{ _fooI :: Int
, _fooF :: f }
deriving (Eq, Show)
type Foo = Foo' (Int -> Int)
eraseFn :: Foo -> Foo' ()
eraseFn foo = foo{ fooF = () }
Then, Foo will still not be Eq- or Showable (which after all it shouldn't be), but to make a Foo value showable you can just wrap it in eraseFn.
Typically what I do in this circumstance is exactly what you say you don’t want to do, namely, wrap the function in a newtype and provide a Show for that:
data T1
{ f :: X -> Y
, xs :: [String]
, ys :: [Bool]
}
data T2
{ f :: OpaqueFunction X Y
, xs :: [String]
, ys :: [Bool]
}
deriving (Show)
newtype OpaqueFunction a b = OpaqueFunction (a -> b)
instance Show (OpaqueFunction a b) where
show = const "<function>"
If you don’t want to do that, you can instead make the function a type parameter, and substitute it out when Showing the type:
data T3' a
{ f :: a
, xs :: [String]
, ys :: [Bool]
}
deriving (Functor, Show)
newtype T3 = T3 (T3' (X -> Y))
data Opaque = Opaque
instance Show Opaque where
show = const "..."
instance Show T3 where
show (T3 t) = show (Opaque <$ t)
Or I’ll refactor my data type to derive Show only for the parts I want to be Showable by default, and override the other parts:
data T4 = T4
{ f :: X -> Y
, xys :: T4' -- Move the other fields into another type.
}
instance Show T4 where
show (T4 f xys) = "T4 <function> " <> show xys
data T4' = T4'
{ xs :: [String]
, ys :: [Bool]
}
deriving (Show) -- Derive ‘Show’ for the showable fields.
Or if my type is small, I’ll use a newtype instead of data, and derive Show via something like OpaqueFunction:
{-# LANGUAGE DerivingVia #-}
newtype T5 = T5 (X -> Y, [String], [Bool])
deriving (Show) via (OpaqueFunction X Y, [String], [Bool])
You can use the iso-deriving package to do this for data types using lenses if you care about keeping the field names / record accessors.
As for Eq (or Ord), it’s not a good idea to have an instance that equates values that can be observably distinguished in some way, since some code will treat them as identical and other code will not, and now you’re forced to care about stability: in some circumstance where I have a == b, should I pick a or b? This is why substitutability is a law for Eq: forall x y f. (x == y) ==> (f x == f y) if f is a “public” function that upholds the invariants of the type of x and y (although floating-point also violates this). A better choice is something like T4 above, having equality only for the parts of a type that can satisfy the laws, or explicitly using comparison modulo some function at use sites, e.g., comparing someField.
The module Text.Show.Functions in base provides a show instance for functions that displays <function>. To use it, just:
import Text.Show.Functions
It just defines an instance something like:
instance Show (a -> b) where
show _ = "<function>"
Similarly, you can define your own Eq instance:
import Text.Show.Functions
instance Eq (a -> b) where
-- all functions are equal...
-- ...though some are more equal than others
_ == _ = True
data Foo = Foo Int Double (Int -> Int) deriving (Show, Eq)
main = do
print $ Foo 1 2.0 (+1)
print $ Foo 1 2.0 (+1) == Foo 1 2.0 (+2) -- is True
This will be an orphan instance, so you'll get a warning with -Wall.
Obviously, these instances will apply to all functions. You can write instances for a more specialized function type (e.g., only for Int -> String, if that's the type of the function field in your data type), but there is no way to simultaneously (1) use the built-in Eq and Show deriving mechanisms to derive instances for your datatype, (2) not introduce a newtype wrapper for the function field (or some other type polymorphism as mentioned in the other answers), and (3) only have the function instances apply to the function field of your data type and not other function values of the same type.
If you really want to limit applicability of the custom function instances without a newtype wrapper, you'd probably need to build your own generics-based solution, which wouldn't make much sense unless you wanted to do this for a lot of data types. If you go this route, then the Generics.Deriving.Show and Generics.Deriving.Eq modules in generic-deriving provide templates for these instances which could be modified to treat functions specially, allowing you to derive per-datatype instances using some stub instances something like:
instance Show Foo where showsPrec = myGenericShowsPrec
instance Eq Foo where (==) = myGenericEquality
I proposed an idea for adding annotations to fields via fields, that allows operating on behaviour of individual fields.
data A = A
{ a :: Int
, b :: Int
, c :: Int -> Int via Ignore (Int->Int)
}
deriving
stock GHC.Generic
deriving (Eq, Show)
via Generically A -- assuming Eq (Generically A)
-- Show (Generically A)
But this is already possible with the "microsurgery" library, but you might have to write some boilerplate to get it going. Another solution is to write separate behaviour in "sums-of-products style"
data A = A Int Int (Int->Int)
deriving
stock GHC.Generic
deriving
anyclass SOP.Generic
deriving (Eq, Show)
via A <-𝈖-> '[ '[ Int, Int, Ignore (Int->Int) ] ]

How to use the same record selector two ways within a function? Lenses?

I have some data that have different representations based on a type parameter, a la Sandy Maguire's Higher Kinded Data. Here are two examples:
wholeMyData :: MyData Z
wholeMyData = MyData 1 'w'
deltaMyData :: MyData Delta
deltaMyData = MyData Nothing (Just $ Left 'b')
I give some of the implementation details below, but first the actual question.
I often want to get a field of the data, usually via a local definition like:
let x = either (Just . Left . myDataChar) myDataChar -- myDataChar a record of MyData
It happens so often I would like to make a standard combinator,
getSubDelta :: ( _ -> _ ) -> Either a b -> Maybe (Either c d)
getSubDelta f = either (Just . Left . f) f
but filling in that signature is problematic. The easy solution is to just supply the record selector function twice,
getSubDelta :: (a->c) -> (b->d) -> Either a b -> Maybe (Either c d)
getSubDelta f g = either (Just . Left . f) g
but that is unseemly. So my question. Is there a way I can fill in the signature above? I'm assuming there is probably a lens based solution, what would that look like? Would it help with deeply nested data? I can't rely on the data types always being single constructor, so prisms? Traversals? My lens game is weak, so I was hoping to get some advice before I proceed.
Thanks!
Some background. I defined a generic method of performing "deltas", via a mix of GHC.Generics and type families. The gist is to use a type family in the definition of the data type. Then, depending how the type is parameterized, the records will either represent whole data or a change to existing data.
For instance, I define the business data using DeltaPoints.
MyData f = MyData { myDataInt :: DeltaPoint f Int
, myDataChar :: DeltaPoint f Char} deriving Generic
The DeltaPoints are implemented in the library, and have different forms for Delta and Z states.
data DeltaState = Z | Delta deriving (Show,Eq,Read)
type family DeltaPoint (st :: DeltaState) a where
DeltaPoint Z a = a
DeltaPoint Delta a = Maybe (Either a (DeltaOf a))
So a DeltaPoint Z a is just the original data, a, and a DeltaPoint Delta a, may or may not be present, and if it is present will either be a replacement of the original (Left) or an update (DeltaOf a).
The runtime delta functionality is encapsulated in a type class.
class HasDelta a where
type DeltaOf a
delta :: a -> a -> Maybe (Either a (DeltaOf a))
applyDeltaOf :: a -> DeltaOf a -> Maybe a
And with the use of Generics, I can usually get the delta capabilities with something like:
instance HasDelta (MyData Z) where
type (DeltaOf (MyData Z)) = MyData Delta
I think you probably want:
{-# LANGUAGE RankNTypes #-}
getSubDelta :: (forall f . (dat f -> DeltaPoint f fld))
-> Either (dat Z) (dat Delta)
-> Maybe (Either (DeltaPoint Z fld) (DeltaOf fld))
getSubDelta sel = either (Just . Left . sel) sel
giving:
x :: Either (MyData Z) (MyData Delta)
-> Maybe (Either (DeltaPoint Z Char) (DeltaOf Char))
x = getSubDelta myDataChar
-- same as: x = either (Just . Left . myDataChar) myDataChar

Deserializing many network messages without using an ad-hoc parser implementation

I have a question pertaining to deserialization. I can envision a solution using Data.Data, Data.Typeable, or with GHC.Generics, but I'm curious if it can be accomplished without generics, SYB, or meta-programming.
Problem Description:
Given a list of [String] that is known to contain the fields of a locally defined algebraic data type, I would like to deserialize the [String] to construct the target data type. I could write a parser to do this, but I'm looking for a generalized solution that will deserialize to an arbitrary number of data types defined within the program without writing a parser for each type. With knowledge of the number and type of value constructors an algebraic type has, it's as simple as performing a read on each string to yield the appropriate values necessary to build up the type. However, I don't want to use generics, reflection, SYB, or meta-programming (unless it's otherwise impossible).
Say I have around 50 types defined similar to this (all simple algebraic types composed of basic primitives (no nested or recursive types, just different combinations and orderings of primitives) :
data NetworkMsg = NetworkMsg { field1 :: Int, field2 :: Int, field3 :: Double}
data NetworkMsg2 = NetworkMsg2 { field1 :: Double, field2 :: Int, field3 :: Double }
I can determine the data-type to be associated with a [String] I've received over the network using a tag id that I parse before each [String].
Possible conjectured solution path:
Since data constructors are first-class values in Haskell, and actually have a type-- Can NetworkMsg constructor be thought of as a function, such as:
NetworkMsg :: Int -> Int -> Double -> NetworkMsg
Could I transform this function into a function on tuples using uncurryN then copy the [String] into a tuple of the same shape the function now takes?
NetworkMsg' :: (Int, Int, Double) -> NetworkMsg
I don't think this would work because I'd need knowledge of the value constructors and type information, which would require Data.Typeable, reflection, or some other metaprogramming technique.
Basically, I'm looking for automatic deserialization of many types without writing type instance declarations or analyzing the type's shape at run-time. If it's not feasible, I'll do it an alternative way.
You are correct in that the constructors are essentially just functions so you can write generic instances for any number of types by just writing instances for the functions. You'll still need to write a separate instance
for all the different numbers of arguments, though.
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
import Text.Read
import Control.Applicative
class FieldParser p r where
parseFields :: p -> [String] -> Maybe r
instance Read a => FieldParser (a -> r) r where
parseFields con [a] = con <$> readMaybe a
parseFields _ _ = Nothing
instance (Read a, Read b) => FieldParser (a -> b -> r) r where
parseFields con [a, b] = con <$> readMaybe a <*> readMaybe b
parseFields _ _ = Nothing
instance (Read a, Read b, Read c) => FieldParser (a -> b -> c -> r) r where
parseFields con [a, b, c] = con <$> readMaybe a <*> readMaybe b <*> readMaybe c
parseFields _ _ = Nothing
{- etc. for as many arguments as you need -}
Now you can use this type class to parse any message based on the constructor as long as the type-checker is able to figure out the resulting message type from context (i.e. it is not able to deduce it simply from the given constructor for these sort of multi-param type class instances).
data Test1 = Test1 {fieldA :: Int} deriving Show
data Test2 = Test2 {fieldB ::Int, fieldC :: Float} deriving Show
test :: String -> [String] -> IO ()
test tag fields = case tag of
"Test1" -> case parseFields Test1 fields of
Just (a :: Test1) -> putStrLn $ "Succesfully parsed " ++ show a
Nothing -> putStrLn "Parse error"
"Test2" -> case parseFields Test2 fields of
Just (a :: Test2) -> putStrLn $ "Succesfully parsed " ++ show a
Nothing -> putStrLn "Parse error"
I'd like to know how exactly you use the message types in the application, though, because having each message as its separate type makes it very difficult to have any sort of generic message handler.
Is there some reason why you don't simply have a single message data type? Such as
data NetworkMsg
= NetworkMsg1 {fieldA :: Int}
| NetworkMsg2 {fieldB :: Int, fieldC :: Float}
Now, while the instances are built in pretty much the same way, you get much better type inference since the result type is always known.
instance Read a => MessageParser (a -> NetworkMsg) where
parseMsg con [a] = con <$> readMaybe a
instance (Read a, Read b) => MessageParser (a -> b -> NetworkMsg) where
parseMsg con [a, b] = con <$> readMaybe a <*> readMaybe b
instance (Read a, Read b, Read c) => MessageParser (a -> b -> c -> NetworkMsg) where
parseMsg con [a, b, c] = con <$> readMaybe a <*> readMaybe b <*> readMaybe c
parseMessage :: String -> [String] -> Maybe NetworkMsg
parseMessage tag fields = case tag of
"NetworkMsg1" -> parseMsg NetworkMsg1 fields
"NetworkMsg2" -> parseMsg NetworkMsg2 fields
_ -> Nothing
I'm also not sure why you want to do type-generic programming specifically without actually using any of the tools meant for generics. GHC.Generics, SYB or Template Haskell is usually the best solution for this kind of problem.

Map identity functor over record

I have a record type like this one:
data VehicleState f = VehicleState
{
orientation :: f (Quaternion Double),
orientationRate :: f (Quaternion Double),
acceleration :: f (V3 (Acceleration Double)),
velocity :: f (V3 (Velocity Double)),
location :: f (Coordinate),
elapsedTime :: f (Time Double)
}
deriving (Show)
This is cool, because I can have a VehicleState Signal where I have all sorts of metadata, I can have a VehicleState (Wire s e m ()) where I have the netwire semantics of each signal, or I can have a VehicleState Identity where I have actual values observed at a certain time.
Is there a good way to map back and forth between VehicleState Identity and VehicleState', defined by mapping runIdentity over each field?
data VehicleState' = VehicleState'
{
orientation :: Quaternion Double,
orientationRate :: Quaternion Double,
acceleration :: V3 (Acceleration Double),
velocity :: V3 (Velocity Double),
location :: Coordinate,
elapsedTime :: Time Double
}
deriving (Show)
Obviously it's trivial to write one, but I actually have several types like this in my real application and I keep adding or removing fields, so it is tedious.
I am writing some Template Haskell that does it, just wondering if I am reinventing the wheel.
If you're not opposed to type families and don't need too much type inference, you can actually get away with using a single datatype:
import Data.Singletons.Prelude
data Record f = Record
{ x :: Apply f Int
, y :: Apply f Bool
, z :: Apply f String
}
type Record' = Record IdSym0
test1 :: Record (TyCon1 Maybe)
test1 = Record (Just 3) Nothing (Just "foo")
test2 :: Record'
test2 = Record 2 False "bar"
The Apply type family is defined in the singletons package. It can be applied to
various type functions also defined in that package (and of course, you can define your
own). The IdSym0 has the property that Apply IdSym0 x reduces to plain x. And
TyCon1 has the property that Apply (TyCon1 f) x reduces to f x.
As demonstrated by
test1 and test2, this allows both versions of your datatype. However, you need
type annotations for most records now.

Trouble with DataKinds

I have created a very simple example of a problem I'm having using GADTs and DataKinds. My real application is obviously more complicated but this captures the essence of my situation clearly. I'm trying to create a function that can return any of the values (T1, T2) of type Test. Is there a way to accomplish this or am I getting into the realm of dependent types? The questions here seem similar but I could not find (or comprehend) an answer to my question from them. I'm just starting to understand these GHC extensions. Thanks.
similar question 1
similar question 2
{-# LANGUAGE GADTs, DataKinds, FlexibleInstances, KindSignatures #-}
module Test where
data TIdx = TI | TD
data Test :: TIdx -> * where
T1 :: Int -> Test TI
T2 :: Double -> Test TD
type T1 = Test TI
type T2 = Test TD
prob :: T1 -> T2 -> Test TIdx
prob x y = undefined
----Here is the error----
Test.hs:14:26:
Kind mis-match
The first argument of `Test' should have kind `TIdx',
but `TIdx' has kind `*'
In the type signature for `prob': prob :: T1 -> T2 -> Test TIdx
The error message you get is because the type parameter to Test needs to
have the kind TIdx, but the only types that have that kind are TI and TD.
The type TIdx has the kind *.
If I understood correctly what you are trying to express is that the result
type of prob is either Test TI or Test TD, but the actual type is
determined at runtime. However, this won't work directly. The return type
generally has to be known at compile time.
What you can do, since the GADT constructors each map to specific phatom type of kind TIdx, is to return a result that erases the phantom type with an
existential or another GADT and then recover the type later using a pattern
match.
For example, if we define two functions that require a specific kind of Test:
fun1 :: T1 -> IO ()
fun1 (T1 i) = putStrLn $ "T1 " ++ show i
fun2 :: T2 -> IO ()
fun2 (T2 d) = putStrLn $ "T2 " ++ show d
This type-checks:
data UnknownTest where
UnknownTest :: Test t -> UnknownTest
prob :: T1 -> T2 -> UnknownTest
prob x y = undefined
main :: IO ()
main = do
let a = T1 5
b = T2 10.0
p = prob a b
case p of
UnknownTest t#(T1 _) -> fun1 t
UnknownTest t#(T2 _) -> fun2 t
The notable thing here is that in the case-expression, even though the
UnknownTest GADT has erased the phantom type, the T1 and T2 constructors give enough
type information to the compiler that t recovers its exact type Test TI or
Test TD within the branch of the case-expression, allowing us to e.g. call
functions that expect those specific types.
You have two options here. Either you can infer the type of the return value from the types of arguments or you can't.
In the former case, you refine the type:
data Which :: TIdx -> * where
Fst :: Which TI
Snd :: Which TD
prob :: Which i -> T1 -> T2 -> Test i
prob Fst x y = x
prob Snd x y = y
In the latter case, you have to erase the type information:
prob :: Bool -> T1 -> T2 -> Either Int Double
prob True (T1 x) y = Left x
prob False x (T2 y) = Right y
You can also erase the type information by using an existential type:
data SomeTest = forall i . SomeTest (Test i)
prob :: Bool -> T1 -> T2 -> SomeTest
prob True x y = SomeTest x
prob False x y = SomeTest y
In this case, you cannot do anything interesting with a value of SomeTest, but you might be able in your real example.

Resources