Code reuse in Haxl - avoiding GADT constructor-per-request-type

Code reuse in Haxl - avoiding GADT constructor-per-request-type - haskell

Haxl is an amazing library, but one of the major pain points I find is caused by the fact that each sort of request to the data source requires its own constructor in the Request GADT. For example, taking the example from the tutorial:
data BlogRequest a where
FetchPosts :: BlogRequest [PostId]
FetchPostContent :: PostId -> BlogRequest PostContent
Then each of these constructors are pattern matched on and processed separately in the match function of the DataSource instance. This style results in a lot of boilerplate for a non-trivial application. Take for example an application using a relational database, where each table has a primary key. There may be many hundreds of tables so I don't want to define a constructor for each table (let alone on all the possible joins across tables...). What I really want is something like:
data DBRequest a where
RequestById :: PersistEntity a => Key a -> DBRequest (Maybe a)
I'm using persistent to create types from my tables, but that isn't a critical detail -- I just want to use a single constructor for multiple possible return types.
The problem comes when trying to write the fetch function. The usual procedure with Haxl is to pattern match on the constructor to separate out the various types of BlockedFetch requests, which in the above example would correspond to something like this:
resVars :: [ResultVar (Maybe a)]
args :: [Key a]
(psArgs, psResVars) = unzip
[(key, r) | BlockedFetch (RequestById key) r <- blockedFetches]
...then I would (somehow) group the arguments by their key type, and dispatch a SQL query for each group. But that approach won't because here may be requests for multiple PersistentEntity types (i.e. database tables), each of which is a different type a, so building the list is impossible. I've thought of using an existentially quantified type to get around this issue (something like SomeSing in the singletons library), but then I see no way to group the requests as required without pattern matching on every possible table/type.
Is there any way to achieve this sort of code reuse?

I see two approaches:
Typeable:
data DBRequest a where
RequestById :: (Typeable a, PersistEntity a) => Key a -> DBRequest (Maybe a)
or GADT "tag" type:
data Tag a where
TagValue1 :: Tag Value1
TagValue2 :: Tag Value2
TagValue3 :: Tag Value3
TagValue4 :: Tag Value4
TagValue5 :: Tag Value5
data DBRequest a where
RequestById :: PersistEntity a => Tag a => Key a -> DBRequest (Maybe a)
These are very similar patterns, especially If you use GHC-8.2,
with https://hackage.haskell.org/package/base-4.10.1.0/docs/Type-Reflection.html
(replace Tag a with TypeRep a).
Either way, you can group Key a using the tag. I haven't tried, but
dependent-map might be handy: http://hackage.haskell.org/package/dependent-map

Related

How to obtain a Data.Data.Constr etc. from a Type Representation?

I'm currently writing a minimalistic Haskell persistence framework that uses Data.Data Generics to provide persistence operations for data types in record syntax (I call them Entities here).
This works quite well overall (see code repo here: https://github.com/thma/generic-persistence), I've got just one ugly spot left.
Currently my function for looking up an entity by primary key has the following signature:
retrieveEntityById
:: forall a conn id. (Data a, IConnection conn, Show id)
=> conn -> TypeInfo -> id -> IO a
This function takes an HDBC database connection, A TypeInfo object and the primary key value id.
The TypeInfo contains data describing the type a. This Info will be used to generate and perform a select statement for primary key lookup and contruct an instance of type a from the HDBC result row.
TypeInfo contains Data.Data.Constr and a description of all constructor fields that are obtained with the following function:
-- | A function that returns a list of FieldInfos representing the
-- name, constructor and type of each field in a data type.
fieldInfo :: (Data a) => a -> [FieldInfo]
fieldInfo x = zipWith3 FieldInfo names constrs types
where
constructor = toConstr x
candidates = constrFields constructor
constrs = gmapQ toConstr x
types = gmapQ typeOf x
names =
if length candidates == length constrs
then map Just candidates
else replicate (length constrs) Nothing
Deriving this kind of information works great when having an actual Data a value at hand (for example in my function for updating an existing entity).
But in the retrieveEntityById case this is not possible, as the object has yet to loaded from the DB. That's why I have to call the function with an extra TypeInfo parameter that I create from a sample entity.
I would like to get rid of this extra parameter, to have a function signature like:
retrieveEntityById
:: forall a conn id. (Data a, IConnection conn, Show id)
=> conn -> id -> IO a
I tried several things like using a Proxy or a TypeRep, but I did not manage to derive my TypeInfo data from them.
Any hints and ideas are most welcome!

You can access metadata from Data a constraints without a value x :: a:
dataTypeOf (undefined :: a) is a representation of the type a. It takes an undefined argument because Data is an old interface from a time when undefined was considered more acceptable.
dataTypeConstrs extracts the list of constructors from that representation.
As you already noted in your code, although there is constrFields to get field names, there isn't an obvious way to get the arity of a non-record constructor. A solution to count the fields of a constructor is to use gunfold with the Const Int functor.
Although your current approach seems specialized to types with a single constructor, this is not a fundamental limitation. For more than one constructor, you could store the constructor name in its persistent encoding, and on decoding, it can be looked up in the list from dataTypeConstrs.

How can I concisely represent a heterogenous sum type in Haskell?

I'm writing a program that transcodes financial statements into a ledger. In that program I have types representing different activities:
data Withdrawal = Withdrawal { wTarget :: !Text, wAmount :: !Cash, wBalance :: !Cash }
data Fee = { fFee :: !Cash, fBalance :: !Cash }
-- many more
I use those types, because I have functions that are transaction-type specific.
I also wanted to write an activity parser that translates CSV records into those types, so I created an Activity sum type:
data Activity =
ActivityFee Fee
| ActivityWithdrawal Withdrawal
| -- ...
parseActivity :: CsvRecord -> Activity
That Activity is quite boilerplate'y. Having to have a new Activity* constructor for a new activity type is slightly cumbersome.
Is there a more idiomatic or better design pattern for this problem? Was it C++, std::variant would be convenient, because adding a new activity type wouldn't entail adding a new boilerplate constructor.
I've considered type-classes, but the problem with them is that they are not closed and I can't pattern match to create a function like applyActivity :: Activity -> Wallet -> Wallet. I see that I could make applyActivity into a function of an Activity class, but then problem is that this solution is only straightforward if only one argument is using this pattern. If we had two arguments like foo :: (ClassOne a, ClassTwo b) => a -> b -> c, then it's not clear to which class foo should belong.

One option is not bothering to define the sum type, and instead make parseActivity return the Wallet -> Wallet operation that characterizes activities, wrapped in some Parser type with an Alternative instance.
parseActivity :: CsvRecord -> Parser (Wallet -> Wallet)
You would still need to define a big Parser value using a bunch of <|> that composed the Parsers for each possible activity.
Additional operations other than Wallet -> Wallet could be supported by making the parser return a record of functions:
data ActivityOps = ActivityOps {
applyActivity :: Wallet -> Wallet,
debugActivity :: String
}
This is still not as versatile as the sum type, because it constrains beforehand the operations that we might do with the activity. To support a new operation, we would need to change the Parser ActivityOps value. With the sum type, we would simply define a new function.
A variant of this solution would be to define a typeclass like
class ActivityOps a where
applyActivity :: a -> Wallet -> Wallet
debugActivity :: a -> String
And make the Parser return some kind of existential like:
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE GADTSyntax #-}
data Activity where
MakeActivity :: ActivityOps a => a -> Activity
This is sometimes frowned upon, but it would have the benefit of being able to easily invoke ActivityOps methods on activities of known type.

Extensible sums are a possible alternative. In that case one would write
type Activity = Sum '[Fee, Withdrawal]
and use match (\fee -> ...) (\withdrawal -> ...) as a substitute for pattern matching.

Is it possible to define your own Persistent / Esqueleto lens?

Given the following persistent type:
share
[mkPersist sqlSettings, mkMigrate "migrateAll"]
[persistLowerCase|
Account
email Text
passphrase Text
firstName Text
lastName Text
deriving Eq Show Generic
|]
What I think is a kind of lens is generated, ie AccountEmail, AccountPassphrase etc etc. Is it possible to combine these in such a way, not necessarily composition but say string concatenation, I often find myself writing these kinds of functions:
accountFullName :: SqlExpr (Entity Account) -> SqlExpr Text
accountFullName acc = acc ^. AccountFirstName ++. val " " ++. acc ^. AccountLastName
It would be good if I could define this in a similar way to Account* so I can call them using ^. rather than using raw functions, ie acc ^. AccountFullName. This may not be an appropriate way of using these accessors, but if it isn't I would be interested to know why as I feel it may help further my understanding of this part of the Persistent library, as I feel fairly lost when I look at the code around EntityField...

This isn't really possible.
(^.) :: (PersistEntity val, PersistField typ)
=> expr (Entity val)
-> EntityField val typ
-> expr (Value typ)
You'll see that the second argument is EntityField val typ, which is a type family defined in the PersistEntity val class. This is pre-defined for your table, and can't be changed, so this particular operator can't be used for custom accessors the way you want.
When you use persistLowerCase and friends, it uses Template Haskell to parse your definition and generate appropriate data definitions. As I understand it, something similar to the following is generated:
data Account = Account
{ accountEmail :: Text
, accountPassphrase :: Text
, accountFirstName :: Text
, accountLastName :: Text
}
data AccountField a where
AccountId :: AccountField Key
AccountEmail :: AccountField Text
AccountPassphrase :: AccountField Text
AccountFirstName :: AccountField Text
AccountLastName :: AccountField Text
instance PersistEntity Account where
data EntityField Account a = AccountField a
...
(This isn't syntactically accurate and is missing a lot of detail, but it provides enough context for this situation I think.)
So the "lens" you pass to (^.) is actually just a constructor for a type associated with your table type Account. You can't create new constructors or re-associate the type family dynamically, so you can't make something else that can be passed to (^.). These accessors are effectively set in stone.
I think it makes the most sense to just go with the raw function. accountFullName acc isn't bad to write, and it makes it clear that you're doing something a little more complex than just pulling a field value.

Deserializing an existential data type

I need to write a Serialize instance for the following data type:
data AnyNode = forall n . (Typeable n, Serialize n) => AnyNode n
Serializing this is no problem, but I can't implement deserialization, since the compiler has no way to resolve the specific instance of Serialize n, since the n is isolated from the outer scope.
There's been a related discussion in 2006. I am now wondering whether any sort of solution or a workaround has arrived today.

You just tag the type when you serialize, and use a dictionary to untag the type when you deserialize. Here's some pseudocode omitting error checking etc:
serialAnyNode (AnyNode x) = serialize (typeOf n, serialize x)
deserialAnyNode s = case deserialize s of
(typ,bs) -> case typ of
"String" -> AnyNode (deserialize bs :: String)
"Int" -> AnyNode (deserialize bs :: Int)
....
Note that you can only deserialize a closed universe of types with your function. With some extra work, you can also deserialize derived types like tuples, maybes and eithers.
But if I were to declare an entirely new type "Gotcha" deriving Typeable and Serialize, deserialAnyNode of course couldn't deal with it without extension.

You need to have some kind of centralised "registry" of deserialization functions so you can dispatch on the actual type (extracted from the Typeable information). If all types you want to deserialize are in the same module this is pretty easy to set up. If they are in multiple modules you need to have one module that has the mapping.
If your collection of types is more dynamic and not easily available at compile time, you can perhaps use the dynamic linking to gain access to the deserializers. For each type that you want to deserialize you export a C callable function with a name derived from the Typeable information (you could use TH to generate these). Then at runtime, when you want to deserialize a type, generate the same name and the use the dynamic linker to get hold of the address of the function and then an FFI wrapper to get a Haskell callable function. This is a rather involved process, but it can be wrapped up in a library. No, sorry, I don't have such a library.

It's hard to tell what you're asking here, exactly. You can certainly pick a particular type T, deserialize a ByteString to it, and store it in an AnyNode. That doesn't do the user of an AnyNode much good, though -- you still picked T, after all. If it wasn't for the Typeable constraint, the user wouldn't even be able to tell what the type is (so let's get rid of the Typeable constraint because it makes things messier). Maybe what you want is a universal instead of an existential.
Let's split Serialize up into two classes -- call them Read and Show -- and simplify them a bit (so e.g. read can't fail).
So we have
class Show a where show :: a -> String
class Read a where read :: String -> a
We can make an existential container for a Show-able value:
data ShowEx where
ShowEx :: forall a. Show a => a -> ShowEx
-- non-GADT: data ShowEx = forall a. Show a => ShowEx a
But of course ShowEx is isomorphic to String, so there isn't a whole lot point to this. But note that an existential for Read is has even less point:
data ReadEx where
ReadEx :: forall a. Read a => a -> ReadEx
-- non-GADT: data ReadEx = forall a. Read a => ReadEx a
When I give you a ReadEx -- i.e. ∃a. Read a *> a -- it means that you have a value of some type, and you don't know what the type is, but you can a String into another value of the same type. But you can't do anything with it! read only produces as, but that doesn't do you any good when you don't know what a is.
What you might want with Read would be a type that lets the caller choose -- i.e., a universal. Something like
newtype ReadUn where
ReadUn :: (forall a. Read a => a) -> ReadUn
-- non-GADT: newtype ReadUn = ReadUn (forall a. Read a => a)
(Like ReadEx, you could make ShowUn -- i.e. ∀a. Show a => a -- and it would be just as useless.)
Note that ShowEx is essentially the argument to show -- i.e. show :: (∃a. Show a *> a) -> String -- and ReadUn is essentially the return value of read -- i.e. read :: String -> (∀a. Read a => a).
So what are you asking for, an existential or a universal? You can certainly make something like ∀a. (Show a, Read a) => a or ∃a. (Show a, Read a) *> a, but neither does you much good here. The real issue is the quantifier.
(I asked a question a while ago where I talked about some of this in another context.)

Polymorphic Return Types Depending on Context

I'm playing around with implementing a Redis client-library in Haskell and it is my goal to encode, as much as possible, the semantics of the Redis commands in Haskell's type system. Redis, for those who don't know, is a datastore, accessed over the network. I will use it to exemplify my problem, but Redis is not the focus of this question.
An Example Function
Consider the function
get :: (RedisValue a) => Key -> Redis a
get k = decodeValue <$> sendCommand ["GET", key]
It sends a command to the datastore and returns a value stored under the given Key (for this example, you can consider type Key = String). As for the return-type:
Redis is an instance of Monad and MonadIO. It encapsulates information about the network connection. sendCommand sends the request and returns the datastore's reply.
a is polymorphic, for example either Strings or ByteStrings can be returned, depending on the context.
The following code should clarify the text above.
data Redis a = ...
instance MonadIO Redis where ...
instance Monad Redis where ...
sendCommand :: [String] -> Redis String
class RedisValue a where
decodeValue :: String -> a
-- example instances
instance RedisValue String where ...
instance RedisValue ByteString where ...
Different Context, Different Types
Redis supports a simple form of transactions. In a transaction most commands can be sent the same as outside of a transaction. However their execution is delayed until the user sends the commit command (which is called exec in Redis). Inside the transaction, the datastore only returns an acknowledgment that the command is stored for later execution. Upon commit (exec) all results of all stored commands are returned.
This means that the get-function from above looks a bit different in the context of a transaction:
get :: (RedisStatus a) => Key -> RedisTransaction a
get k = decodeStatus <$> sendCommand ["GET", key]
Note that:
The monadic type is now RedisTransaction to indicate the transaction context.
The a return type is now any instance of RedisStatus. There is an overlap between instances of RedisValue and RedisStatus. For example String is in both classes. A specialized Status data type might be only in the RedisStatus class.
The Actual Question
My question is, how can I write a function get that works in both contexts, with context-appropriate return type classes. What I need is
a way to give get a return type "either Redis or RedisTransaction",
The type a to be an instance of RedisValue in the Redis context and an instance of RedisStatus in the RedisTransaction context.
A function decode that automagically does the right thing, depending on the context. I assume this must come from a (multi-param) type class.
If you know how I can do this or have a pointer to some example code or even an article, you will have my thanks!

First, I think that it would be better to have two different get commands. That said, here's an approach.
class RedisGet m a where
get :: Key -> m a
instance (RedisValue a) => RedisGet Redis a where...
instance (RedisStatus a) => RedisGet RedisTransaction a where...
You need
MPTCs, but no FunDeps or Type Families. Every use of get requires that enough information be available to determine both the m and a uniquely.

I agree that multiparameter type classes are a good fit here. Here's an approach:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
newtype Redis a = Redis (IO a) deriving Monad
newtype RedisTransaction a = RedisTransaction (IO a) deriving Monad
newtype Key = Key {unKey :: String}
newtype Value = Value {unValue :: String}
newtype Status = Status {unStatus :: String}
class Monad m => RedisMonad m a | m -> a where
sendCommand :: [String] -> m a
instance RedisMonad Redis Value where
sendCommand = undefined -- TODO: provide implementation
instance RedisMonad RedisTransaction Status where
sendCommand = undefined -- TODO: provide implementation
class Decodable a b where
decode :: a -> b
instance Decodable Status String where
decode = unStatus
instance Decodable Value String where
decode = unValue
get :: (RedisMonad m a, Decodable a b) => Key -> m b
get k = do
response <- sendCommand ["GET", unKey k]
return (decode response)
Note the use of the type isomorphisms for Value and Status: it makes things slightly stronger typed as the Strings you are having produced by your implementations of sendCommand obviously are not just arbitrary sequences of characters but instead adhere to some fixed formats for return values and statuses.

Keep in mind that there's nothing special about a type depending on context--that happens all the time with type inference. The type of [] is [a], but when you use it in something like True : [] the type will be specialized to [Bool] in context.
What changes matters is if you want the implementation of a function, or the definition of a value, to depend on its type. If that type is then inferred from context in the normal way, you end up with a function that does something "different" depending on context. Type-dependent implementation is the main purpose of using type classes.
Now, to answer your specific questions:
a way to give get a return type "either Redis or RedisTransaction",
This requires only a variable in the type signature of get, e.g. get :: Key -> f a. The f will be filled in as either Redis or RedisTransaction depending on context.
The type a to be an instance of RedisValue in the Redis context and an instance of RedisStatus in the RedisTransaction context.
Since both a and the context type will be inferred from use, what you're really after here is restricting the possible types, which amounts to expecting a type-checking error if they don't match up. This is another purpose of type classes, and could be implemented with an appropriate class constraint on the context type variable:
get :: (ContextValue (f a)) => Key -> f a
class ContextValue a
instance (RedisValue a) => ContextValue (Redis a)
instance (RedisStatus a) => ContextValue (RedisTransaction a)
Or something like that. But that alone isn't enough for your purposes, because...
A function decode that automagically does the right thing, depending on the context. I assume this must come from a (multi-param) type class.
This implies selecting an implementation for decode based on the type, which means making it part of a type class such as the above ContextValue. How you handle this depends on what the type for decode needs to be--if the result type needs to be something like f String -> f a where f is the monadic context, then you'll probably need something a bit more elaborate, like in dblhelix's answer. If you only need String -> f a, then you could add it to the above ContextValue class directly.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Code reuse in Haxl - avoiding GADT constructor-per-request-type - haskell

Related

How to obtain a Data.Data.Constr etc. from a Type Representation?

How can I concisely represent a heterogenous sum type in Haskell?

Is it possible to define your own Persistent / Esqueleto lens?

Deserializing an existential data type

Polymorphic Return Types Depending on Context

Categories

Resources