Polymorphic Return Types Depending on Context - haskell

I'm playing around with implementing a Redis client-library in Haskell and it is my goal to encode, as much as possible, the semantics of the Redis commands in Haskell's type system. Redis, for those who don't know, is a datastore, accessed over the network. I will use it to exemplify my problem, but Redis is not the focus of this question.
An Example Function
Consider the function
get :: (RedisValue a) => Key -> Redis a
get k = decodeValue <$> sendCommand ["GET", key]
It sends a command to the datastore and returns a value stored under the given Key (for this example, you can consider type Key = String). As for the return-type:
Redis is an instance of Monad and MonadIO. It encapsulates information about the network connection. sendCommand sends the request and returns the datastore's reply.
a is polymorphic, for example either Strings or ByteStrings can be returned, depending on the context.
The following code should clarify the text above.
data Redis a = ...
instance MonadIO Redis where ...
instance Monad Redis where ...
sendCommand :: [String] -> Redis String
class RedisValue a where
decodeValue :: String -> a
-- example instances
instance RedisValue String where ...
instance RedisValue ByteString where ...
Different Context, Different Types
Redis supports a simple form of transactions. In a transaction most commands can be sent the same as outside of a transaction. However their execution is delayed until the user sends the commit command (which is called exec in Redis). Inside the transaction, the datastore only returns an acknowledgment that the command is stored for later execution. Upon commit (exec) all results of all stored commands are returned.
This means that the get-function from above looks a bit different in the context of a transaction:
get :: (RedisStatus a) => Key -> RedisTransaction a
get k = decodeStatus <$> sendCommand ["GET", key]
Note that:
The monadic type is now RedisTransaction to indicate the transaction context.
The a return type is now any instance of RedisStatus. There is an overlap between instances of RedisValue and RedisStatus. For example String is in both classes. A specialized Status data type might be only in the RedisStatus class.
The Actual Question
My question is, how can I write a function get that works in both contexts, with context-appropriate return type classes. What I need is
a way to give get a return type "either Redis or RedisTransaction",
The type a to be an instance of RedisValue in the Redis context and an instance of RedisStatus in the RedisTransaction context.
A function decode that automagically does the right thing, depending on the context. I assume this must come from a (multi-param) type class.
If you know how I can do this or have a pointer to some example code or even an article, you will have my thanks!

First, I think that it would be better to have two different get commands. That said, here's an approach.
class RedisGet m a where
get :: Key -> m a
instance (RedisValue a) => RedisGet Redis a where...
instance (RedisStatus a) => RedisGet RedisTransaction a where...
You need
MPTCs, but no FunDeps or Type Families. Every use of get requires that enough information be available to determine both the m and a uniquely.

I agree that multiparameter type classes are a good fit here. Here's an approach:
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}
newtype Redis a = Redis (IO a) deriving Monad
newtype RedisTransaction a = RedisTransaction (IO a) deriving Monad
newtype Key = Key {unKey :: String}
newtype Value = Value {unValue :: String}
newtype Status = Status {unStatus :: String}
class Monad m => RedisMonad m a | m -> a where
sendCommand :: [String] -> m a
instance RedisMonad Redis Value where
sendCommand = undefined -- TODO: provide implementation
instance RedisMonad RedisTransaction Status where
sendCommand = undefined -- TODO: provide implementation
class Decodable a b where
decode :: a -> b
instance Decodable Status String where
decode = unStatus
instance Decodable Value String where
decode = unValue
get :: (RedisMonad m a, Decodable a b) => Key -> m b
get k = do
response <- sendCommand ["GET", unKey k]
return (decode response)
Note the use of the type isomorphisms for Value and Status: it makes things slightly stronger typed as the Strings you are having produced by your implementations of sendCommand obviously are not just arbitrary sequences of characters but instead adhere to some fixed formats for return values and statuses.

Keep in mind that there's nothing special about a type depending on context--that happens all the time with type inference. The type of [] is [a], but when you use it in something like True : [] the type will be specialized to [Bool] in context.
What changes matters is if you want the implementation of a function, or the definition of a value, to depend on its type. If that type is then inferred from context in the normal way, you end up with a function that does something "different" depending on context. Type-dependent implementation is the main purpose of using type classes.
Now, to answer your specific questions:
a way to give get a return type "either Redis or RedisTransaction",
This requires only a variable in the type signature of get, e.g. get :: Key -> f a. The f will be filled in as either Redis or RedisTransaction depending on context.
The type a to be an instance of RedisValue in the Redis context and an instance of RedisStatus in the RedisTransaction context.
Since both a and the context type will be inferred from use, what you're really after here is restricting the possible types, which amounts to expecting a type-checking error if they don't match up. This is another purpose of type classes, and could be implemented with an appropriate class constraint on the context type variable:
get :: (ContextValue (f a)) => Key -> f a
class ContextValue a
instance (RedisValue a) => ContextValue (Redis a)
instance (RedisStatus a) => ContextValue (RedisTransaction a)
Or something like that. But that alone isn't enough for your purposes, because...
A function decode that automagically does the right thing, depending on the context. I assume this must come from a (multi-param) type class.
This implies selecting an implementation for decode based on the type, which means making it part of a type class such as the above ContextValue. How you handle this depends on what the type for decode needs to be--if the result type needs to be something like f String -> f a where f is the monadic context, then you'll probably need something a bit more elaborate, like in dblhelix's answer. If you only need String -> f a, then you could add it to the above ContextValue class directly.

Related

Clarification on Existential Types in Haskell

I am trying to understand Existential types in Haskell and came across a PDF http://www.ii.uni.wroc.pl/~dabi/courses/ZPF15/rlasocha/prezentacja.pdf
Please correct my below understandings that I have till now.
Existential Types not seem to be interested in the type they contain but pattern matching them say that there exists some type we don't know what type it is until & unless we use Typeable or Data.
We use them when we want to Hide types (ex: for Heterogeneous Lists) or we don't really know what the types at Compile Time.
GADT's provide the clear & better syntax to code using Existential Types by providing implicit forall's
My Doubts
In Page 20 of above PDF it is mentioned for below code that it is impossible for a Function to demand specific Buffer. Why is it so? When I am drafting a Function I exactly know what kind of buffer I gonna use eventhough I may not know what data I gonna put into that.
What's wrong in Having :: Worker MemoryBuffer Int If they really want to abstract over Buffer they can have a Sum type data Buffer = MemoryBuffer | NetBuffer | RandomBuffer and have a type like :: Worker Buffer Int
data Worker x = forall b. Buffer b => Worker {buffer :: b, input :: x}
data MemoryBuffer = MemoryBuffer
memoryWorker = Worker MemoryBuffer (1 :: Int)
memoryWorker :: Worker Int
As Haskell is a Full Type Erasure language like C then How does it know at Runtime which function to call. Is it something like we gonna maintain few information and pass in a Huge V-Table of Functions and at runtime it gonna figure out from V-Table? If it is so then what sort of Information it gonna store?
GADT's provide the clear & better syntax to code using Existential Types by providing implicit forall's
I think there's general agreement that the GADT syntax is better. I wouldn't say that it's because GADTs provide implicit foralls, but rather because the original syntax, enabled with the ExistentialQuantification extension, is potentially confusing/misleading. That syntax, of course, looks like:
data SomeType = forall a. SomeType a
or with a constraint:
data SomeShowableType = forall a. Show a => SomeShowableType a
and I think the consensus is that the use of the keyword forall here allows the type to be easily confused with the completely different type:
data AnyType = AnyType (forall a. a) -- need RankNTypes extension
A better syntax might have used a separate exists keyword, so you'd write:
data SomeType = SomeType (exists a. a) -- not valid GHC syntax
The GADT syntax, whether used with implicit or explicit forall, is more uniform across these types, and seems to be easier to understand. Even with an explicit forall, the following definition gets across the idea that you can take a value of any type a and put it inside a monomorphic SomeType':
data SomeType' where
SomeType' :: forall a. (a -> SomeType') -- parentheses optional
and it's easy to see and understand the difference between that type and:
data AnyType' where
AnyType' :: (forall a. a) -> AnyType'
Existential Types not seem to be interested in the type they contain but pattern matching them say that there exists some type we don't know what type it is until & unless we use Typeable or Data.
We use them when we want to Hide types (ex: for Heterogeneous Lists) or we don't really know what the types at Compile Time.
I guess these aren't too far off, though you don't have to use Typeable or Data to use existential types. I think it would be more accurate to say an existential type provides a well-typed "box" around an unspecified type. The box does "hide" the type in a sense, which allows you to make a heterogeneous list of such boxes, ignoring the types they contain. It turns out that an unconstrained existential, like SomeType' above is pretty useless, but a constrained type:
data SomeShowableType' where
SomeShowableType' :: forall a. (Show a) => a -> SomeShowableType'
allows you to pattern match to peek inside the "box" and make the type class facilities available:
showIt :: SomeShowableType' -> String
showIt (SomeShowableType' x) = show x
Note that this works for any type class, not just Typeable or Data.
With regard to your confusion about page 20 of the slide deck, the author is saying that it's impossible for a function that takes an existential Worker to demand a Worker having a particular Buffer instance. You can write a function to create a Worker using a particular type of Buffer, like MemoryBuffer:
class Buffer b where
output :: String -> b -> IO ()
data Worker x = forall b. Buffer b => Worker {buffer :: b, input :: x}
data MemoryBuffer = MemoryBuffer
instance Buffer MemoryBuffer
memoryWorker = Worker MemoryBuffer (1 :: Int)
memoryWorker :: Worker Int
but if you write a function that takes a Worker as argument, it can only use the general Buffer type class facilities (e.g., the function output):
doWork :: Worker Int -> IO ()
doWork (Worker b x) = output (show x) b
It can't try to demand that b be a particular type of buffer, even via pattern matching:
doWorkBroken :: Worker Int -> IO ()
doWorkBroken (Worker b x) = case b of
MemoryBuffer -> error "try this" -- type error
_ -> error "try that"
Finally, runtime information about existential types is made available through implicit "dictionary" arguments for the typeclasses that are involved. The Worker type above, in addtion to having fields for the buffer and input, also has an invisible implicit field that points to the Buffer dictionary (somewhat like v-table, though it's hardly huge, as it just contains a pointer to the appropriate output function).
Internally, the type class Buffer is represented as a data type with function fields, and instances are "dictionaries" of this type:
data Buffer' b = Buffer' { output' :: String -> b -> IO () }
dBuffer_MemoryBuffer :: Buffer' MemoryBuffer
dBuffer_MemoryBuffer = Buffer' { output' = undefined }
The existential type has a hidden field for this dictionary:
data Worker' x = forall b. Worker' { dBuffer :: Buffer' b, buffer' :: b, input' :: x }
and a function like doWork that operates on existential Worker' values is implemented as:
doWork' :: Worker' Int -> IO ()
doWork' (Worker' dBuf b x) = output' dBuf (show x) b
For a type class with only one function, the dictionary is actually optimized to a newtype, so in this example, the existential Worker type includes a hidden field that consists of a function pointer to the output function for the buffer, and that's the only runtime information needed by doWork.
In Page 20 of above PDF it is mentioned for below code that it is impossible for a Function to demand specific Buffer. Why is it so?
Because Worker, as defined, takes only one argument, the type of the "input" field (type variable x). E.g. Worker Int is a type. The type variable b, instead, is not a parameter of Worker, but is a sort of "local variable", so to speak. It can not be passed as in Worker Int String -- that would trigger a type error.
If we instead defined:
data Worker x b = Worker {buffer :: b, input :: x}
then Worker Int String would work, but the type is no longer existential -- we now always have to pass the buffer type as well.
As Haskell is a Full Type Erasure language like C then How does it know at Runtime which function to call. Is it something like we gonna maintain few information and pass in a Huge V-Table of Functions and at runtime it gonna figure out from V-Table? If it is so then what sort of Information it gonna store?
This is roughly correct. Briefly put, each time you apply constructor Worker, GHC infers the b type from the arguments of Worker, and then searches for an instance Buffer b. If that is found, GHC includes an additional pointer to the instance in the object. In its simplest form, this is not too different from the "pointer to vtable" which is added to each object in OOP when virtual functions are present.
In the general case, it can be much more complex, though. The compiler might use a different representation and add more pointers instead of a single one (say, directly adding the pointers to all the instance methods), if that speeds up code. Also, sometimes the compiler needs to use multiple instances to satisfy a constraint. E.g., if we need to store the instance for Eq [Int] ... then there is not one but two: one for Int and one for lists, and the two needs to be combined (at run time, barring optimizations).
It is hard to guess exactly what GHC does in each case: that depends on a ton of optimizations which might or might not trigger.
You could try googling for the "dictionary based" implementation of type classes to see more about what's going on. You can also ask GHC to print the internal optimized Core with -ddump-simpl and observe the dictionaries being constructed, stored, and passed around. I have to warn you: Core is rather low level, and can be hard to read at first.

Resolving an ambiguous type variable

I have these two functions:
load :: Asset a => Reference -> IO (Maybe a)
send :: Asset a => a -> IO ()
The Asset class look like this:
class (Typeable a,ToJSON a, FromJSON a) => Asset a where
ref :: a -> Reference
...
The first reads an asset from disk, and the second transmits a JSON representation to a WebSocket. In isolation they work fine, but when I combine them the compiler cannot deduce what concrete type a should be. (Could not deduce (Asset a0) arising from a use of 'load')
This makes sense, I have not given a concrete type and both load and send are polymorphic. Somehow the compiler has to decide which version of send (and by extension what version of toJSON) to use.
I can determine at run time what the concrete type of a is. This information is actually encoded both in the data on the disk and the Reference type, but I do not know for sure at compile time as the type checker is being run.
Is there a way to pass the correct type at run time an still keep the type checker happy?
Additional Information
The definition of Reference
data Reference = Ref {
assetType:: String
, assetIndex :: Int
} deriving (Eq, Ord, Show, Generic)
References are derived by parsing a request from a WebSocket as follows where Parser comes from the Parsec library.
reference :: Parser Reference
reference = do
t <- string "User"
<|> string "Port"
<|> string "Model"
<|> ...
char '-'
i <- int
return Ref {assetType = t, assetIndex =i}
If I added a type parameter to Reference I simply push my problem back into the parser. I still need to turn a string that I do not know at compile time into a type to make this work.
You can't make a function that turns string data into values of different types depending on what is in the string. That's simply impossible. You need to rearrange things so that your return-type doesn't depend on the string contents.
Your type for load, Asset a => Reference -> IO (Maybe a) says "pick any a (where Asset a) you like and give me a Reference, and I'll give you back an IO action that produces Maybe a". The caller picks the type they expect to be loaded by the reference; the contents of the file do not influence which type is loaded. But you don't want it to be chosen by the caller, you want it to be chosen by what's stored on disk, so the type signature simply doesn't express the operation you actually want. That's your real problem; the ambiguous type variable when combining load and send would be easily resolved (with a type signature or TypeApplications) if load and send were individually correct and combining them was the only problem.
Basically you can't just have load return a polymorphic type, because if it does then the caller gets to (must) decide what type it returns. There's two ways to avoid this that are more-or-less equivalent: return an existential wrapper, or use rank 2 types and add a polymorphic handler function (continuation) as a parameter.
Using an existential wrapper (requires GADTs extension), it looks something like this:
data SomeAsset
where Some :: Asset a => a -> SomeAsset
load :: Reference -> IO (Maybe SomeAsset)
Notice load is no longer polymorphic. You get a SomeAsset that (as far as the type checker is concerned) could contain any type that has an Asset instance. load can internally use whatever logic it wants split into multiple branches and come up with values of different types of asset on different branches; provided each branch ends with wrapping up the asset value with the SomeAsset constructor all of the branches will return the same type.
To send it, you would use something like (ignoring that I'm not handling Nothing):
loadAndSend :: Reference -> IO ()
loadAndSend ref
= do Just someAsset <- load ref
case someAsset
of SomeAsset asset -> send asset
The SomeAsset wrapper guarantees that Asset holds for its wrapped value, so you can unwrap them and call any Asset-polymorphic function on the result. However you can never do anything with the value that depends on the specific type in any other way1, which is why you have to keep it wrapped up and case match on it all the time; if the case expression results in a type that depends on the contained type (such as case someAsset of SomeAsset a -> a) the compiler will not accept your code.
The other way is to instead use RankNTypes and give load a type like this:
load :: (forall a. Asset a => a -> r) -> Reference -> IO (Maybe r)
Here load doesn't return a value representing the loaded asset at all. What it does instead is take a polymorphic function as an argument; the function works on any Asset and returns a type r (that was chosen by load's caller), so again load can internally branch however it wants and construct differently-typed assets in the different branches. The different asset types can all be passed to the handler, so the handler can be called in every branch.
My preference is often to use the SomeAsset approach, but then to also use RankNTypes and define a helper function like:
withSomeAsset :: (forall a. Asset a => a -> r) -> (SomeAsset -> r)
withSomeAsset f (SomeAsset a) = f a
This avoids having to restructure your code into continuation passing style, but takes away the heave case syntax everywhere you need to use a SomeAsset:
loadAndSend :: Reference -> IO ()
loadAndSend ref
= do Just asset <- load ref
withSomeAsset send asset
Or even add:
sendSome = withSomeAsset send
Daniel Wagner suggested adding the type parameter to Reference, which the OP objected to by stating that simply moves the same problem to when the references are constructed. If the references contain data representing which type of asset they refer to, then I would strongly recommend taking Daniel's advice, and using the concepts described in this answer to address that problem at the reference-constructing level. Reference having a type parameter prevents mixing up references to the wrong types of assets where you do know the type.
And if you do significant processing with references and assets of the same type, then having the type parameter in your workhorse code can catch easy mistakes mixing them up even if you usually existential the type away at the outer levels of code.
1 Technically your Asset implies Typeable, so you can test it for specific types and then return those.
Sure, make Reference store the type.
data Reference a where
UserRef :: Int -> Reference User
PortRef :: Int -> Reference Port
ModelRef :: Int -> Reference Model
load :: Asset a => Reference a -> IO (Maybe a)
send :: Asset a => a -> IO ()
If necessary, you can still recover the strong points of your original Reference type by existentially boxing it.
data SomeAsset f where SomeAsset :: Asset a => f a -> SomeAsset f
reference :: Parser (SomeAsset Reference)
reference = asum
[ string "User" *> go UserRef
, string "Port" *> go PortRef
, string "Model" *> go ModelRef
]
where
go :: Asset a => (Int -> Parser (Reference a)) -> Parser (SomeAsset Reference)
go constructor = constructor <$ char '-' <*> int
loadAndSend :: SomeAsset Reference -> IO ()
loadAndSend (SomeAsset reference) = load reference >>= traverse_ send
After reviewing the answers from Daniel Wagner and Ben, I ultimately resolved my issue using a combination of the two which I place here in hopes it will aid others.
First, per Daniel Wagner's answer, I added a phantom type to Reference:
data Reference a = Ref {
assetType:: String
, assetIndex :: Int
} deriving (Eq, Ord, Show, Generic)
I chose not to use a GADT constructors and leave the string reference to assetType as I frequently send references over the wire and/or parse them from incoming text. I felt there were too many code points where I needed a generic reference. For those cases, I fill in the phantom type with Void:
{-# LANGUAGE EmptyDataDecls #-}
data Void
-- make this reference Generic
voidRef :: Reference a -> Reference Void
castRef :: a -> Reference b -> Reference a
-- ^^^ Note this can be undefined used only for its type
With this the load type signature becomes load :: Asset a => Reference a -> IO (Maybe a) So the Asset is always matches the type of the Reference. (Yay type safety!)
That still doesn't address how to load a generic reference. For those cases, I wrote some new code using the second half of Ben's answer. By wrapping the asset in SomeAsset, I can return a Type which is making the type checker happy.
{-# LANGUAGE GADTs #-}
import Data.Aeson (encode)
loadGenericAsset :: Reference Void -> IO SomeAsset
loadGenericAsset ref =
case assetType ref of
"User" -> Some <$> load (castRef (undefined :: User) ref)
"Port" -> Some <$> load (castRef (undefined :: Port) ref)
[etc...]
send :: SomeAsset -> IO ()
send (Some a) = writeToUser (encode a)
data SomeAsset where
Some :: Asset a => a -> SomeAsset

Type constraints are ending up ambiguous

In a Haskell app I'm working on, I have an API in which I'm trying to set up a pluggable set of backends. I'll have several different backend types and I want the caller (right now, just the test suite) to determine the actual backends. However, I'm getting an ambiguous type error.
class HasJobQueue ctx queue where
hasJobQueue :: JobQueue queue => ctx -> queue
class JobQueue q where
enqueue :: MonadIO m => Command -> q -> m ()
type CloisterM ctx queue exc m = ( Monad m, MonadIO m, MonadError exc m, MonadReader ctx m
, AsCloisterExc exc
, HasJobQueue ctx queue
, JobQueue queue
)
createDocument :: forall ctx queue exc m. CloisterM ctx queue exc m => Path -> Document -> m DocumentAddr
createDocument path document = do
...
queue <- hasJobQueue <$> ask
enqueue (SaveDocument addr document) queue
...
So, to me this seems pretty clear. In createDocument, I'm wanting to retrieve the context, and from that retrieve the job queue, which the caller would define and attach to the context. But Haskell disagrees and gives me this error:
• Could not deduce (JobQueue q0)
arising from a use of ‘hasJobQueue’
from the context: CloisterM ctx queue exc m
bound by the type signature for:
createDocument :: CloisterM ctx queue exc m =>
Path -> Document -> m DocumentAddr
at src/LuminescentDreams/CloisterDB.hs:32:1-105
The type variable ‘q0’ is ambiguous
• In the first argument of ‘(<$>)’, namely ‘hasJobQueue’
Here is an example of what I am trying to build, this one from my API test suite where I am mocking all of the backends with simple IORefs, where production will have other backend implementations
data MemoryCloister = MemoryCloister WorkBuffer
newtype WorkBuffer = WorkBuffer (IORef [WorkItem Command])
instance JobQueue WorkBuffer where
hasJobQueue (MemoryCloister wb) = wb
instance JobQueue WorkBuffer where
...
So, what exactly do I need to do to help the type checker understand that the context in the MonadReader contains an object that implements the JobQueue class?
The entire data types file, including how I ultimately reformulated the JobQueue for something more flexible than even the above, is in this project
While it’s difficult to know exactly what the right solution to your problem is based on the code and context given, the error you’re seeing stems from the HasJobQueue typeclass, which is extremely general:
class HasJobQueue ctx queue where
hasJobQueue :: JobQueue queue => ctx -> queue
From the typechecker’s point of view, hasJobQueue is a function from a -> b, plus a few constraints (but constraints do not generally affect type inference). This means that, in order to invoke hasJobQueue, both its input and its output must be entirely unambiguously specified by some other source of type information.
If this this confusing, consider a slightly different class that is nearly identical to the typechecker:
class Convert a b where
convert :: a -> b
This typeclass is generally an antipattern (precisely because it makes type inference very hard), but it could theoretically be used to provide instances to convert between any two types. For example, one could write the following instance:
instance Convert Integer String where
convert = show
…then use convert to convert an integer to a string:
ghci> convert (42 :: Integer) :: String
"42"
However, note that the following will not work:
ghci> convert (42 :: Integer)
<interactive>:26:1: error:
• Ambiguous type variable ‘a0’ arising from a use of ‘print’
prevents the constraint ‘(Show a0)’ from being solved.
Probable fix: use a type annotation to specify what ‘a0’ should be.
The issue here is that GHC has no idea what b should be, so it can’t pick which Convert instance to use.
In your code, hasJobQueue is much the same, though the details are a little more complex. The issue arises in the following lines:
queue <- hasJobQueue <$> ask
enqueue (SaveDocument addr document) queue
In order to know which HasJobQueue instance to use, GHC needs to know the type of queue. Well, fortunately, GHC can infer types for bindings based on how they are used, so hopefully, queue’s type can be inferred. It is provided as the second argument to enqueue, so we can understand what’s going on by looking at the type of enqueue:
enqueue :: (JobQueue q, MonadIO m) => Command -> q -> m ()
And here we see the problem. The second argument of enqueue must have type q, which is also unconstrained, so GHC does not gain any additional information. Therefore, it can’t determine the type of q, and it doesn’t know which instance to use for either the call to hasJobQueue or the call to enqueue.
So how can you solve this? Well, one way would be to pick a particular type for queue, but based on your code, I’m betting that isn’t actually what you want. More likely, there is a particular type of queue associated with each particular ctx, so the return type of hasJobQueue should really be implied by its first argument. Fortunately, Haskell has a notion to encode this very thing, and that notion is functional dependencies.
Remember that I said at the beginning that constraints do not generally affect type inference? Functional dependencies change that. When you write a fundep, you state that the typechecker actually can gain information from a constraint because some of the type variables imply some of the others. In this case, you want queue to be implied by ctx, so you can change the definition of HasJobQueue:
class HasJobQueue ctx queue | ctx -> queue where
hasJobQueue :: JobQueue queue => ctx -> queue
The | ctx -> queue syntax can be read as “ctx implies queue”.
Now, when you write hasJobQueue <$> ask, GHC already knows ctx, and it knows it can figure out queue from ctx. Therefore, the code is no longer ambiguous, and it can pick the correct instances.
Of course, nothing is for free. The functional dependency is nice, but what do we give up? Well, it means we promise that, for each ctx, there is exactly one queue, no more. Without the functional dependency, both of these instances can coexist:
instance HasJobQueue FooCtx MyQueueA
instance HasJobQueue FooCtx MyQueueB
These are totally legal, and GHC will pick the instance based on which type of queue is requested by the calling code. With the functional dependency, this is illegal, which makes sense—the whole point is that the second parameter must be implied by the first, and if two different options are possible, GHC can’t disambiguate by the first parameter alone.
In this sense, functional dependencies allow typeclass constraints to have “input” and “output” parameters. Sometimes, functional dependencies are called “type-level Prolog”, since they turn the constraint solver into a relational sublanguage. This is extremely powerful, and you can even write classes with bidirectional relations:
class Add a b c | a b -> c, a c -> b, b c -> a
Usually, though, most uses of functional dependencies involve cases like the one you ran into, where one structure semantically “has” an associated type. For example, one of the classic examples come from the mtl library, which use functional dependencies to represent reader context, writer state, etc.:
class MonadReader r m | m -> r
class MonadWriter w m | m -> w
class MonadState s m | m -> s
class MonadError e m | m -> e
This means they can be equivalently expressed in a slightly different way using associated types (part of the TypeFamilies extension)… but that is probably outside the scope of this answer.

Statically enforcing that two objects were created from the same (Int) "seed"

In a library I'm working on, I have an API similar to the following:
data Collection a = Collection Seed {-etc...-}
type Seed = Int
newCollection :: Seed -> IO (Collection a)
newCollection = undefined
insert :: a -> Collection a -> IO () -- ...and other mutable set-like functions
insert = undefined
mergeCollections :: Collection a -> Collection a -> IO (Collection a)
mergeCollections (Collection s0 {-etc...-}) (Collection s1 {-etc...-})
| s0 /= s1 = error "This is invalid; how can we make it statically unreachable?"
| otherwise = undefined
I'd like to be able to enforce that the user cannot call mergeCollections on Collections created with different Seed values.
I thought of trying to tag Collection with a type-level natural: I think this would mean that the Seed would have to be statically known at compile time, but my users might be getting it from an environment variable or user input, so I don't think that would work.
I also hoped I might be able to do something like:
newtype Seed u = Seed Int
newSeed :: Int -> Seed u
newCollection :: Seed u -> IO (Collection u a)
mergeCollections :: Collection u a -> Collection u a -> IO (Collection u a)
Where somehow a Seed is tagged with a unique type in some way, such that the type system could track that both arguments to merge were created from the seed returned by the same invocation of newSeed. To be clear in this (hand-wavy) scheme a and b here would somehow not unify: let a = newSeed 1; b = newSeed 1;.
Is something like this possible?
Examples
Here are some examples of ways I can imagine users creating Seeds and Collections. Users would like to use the other operations (inserting, merging, etc) as freely as they could with any other IO mutable collection:
We only need one seed for all Collections (dynamically) created during the program, but the user must be able to specify in some way how the seed should be determined from the environment at runtime.
One or more static keys gathered from environment vars (or command line args):
main = do
s1 <- getEnv "SEED1"
s2 <- getEnv "SEED2"
-- ... many Collections may be created dynamically from these seeds
-- and dynamically merged later
Probably not in a convenient way. For handling seeds that are known only at runtime, you can use existential types; but then you cannot statically check that two of these existentially wrapped collections match up. The much simpler solution is simply this:
merge :: Collection a -> Collection a -> IO (Maybe (Collection a))
On the other hand, if it is okay to force all operations to be done "together", in a sense, then you can do something like what the ST monad does: group all the operations together, then supply an operation for "running" all the operations that only works if the operations don't leak collections by demanding they be perfectly polymorphic over a phantom variable, hence that the return type doesn't mention the phantom variable. (Tikhon Jelvis also suggests this in his comments.) Here's how that might look:
{-# LANGUAGE Rank2Types #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Collection (Collection, COp, newCollection, merge, inspect, runCOp) where
import Control.Monad.Reader
type Seed = Int
data Collection s a = Collection Seed
newtype COp s a = COp (Seed -> a) deriving (Functor, Applicative, Monad, MonadReader Seed)
newCollection :: COp s (Collection s a)
newCollection = Collection <$> ask
merge :: Collection s a -> Collection s a -> COp s (Collection s a)
merge l r = return (whatever l r) where
whatever = const
-- just an example; substitute whatever functions you want to have for
-- consuming Collections
inspect :: Collection s a -> COp s Int
inspect (Collection seed) = return seed
runCOp :: (forall s. COp s a) -> Seed -> a
runCOp (COp f) = f
Note in particular that the COp and Collection constructors are not exported. Consequently we need never fear that a Collection will escape its COp; runCOp newCollection is not well-typed (and any other operation that tries to "leak" the collection to the outside world will have the same property). Therefore it will not be possible to pass a Collection constructed with one seed to a merge operating in the context of another seed.
I believe this is impossible with the constraint that the seeds come from runtime values, like user input. The typechecker as a tool can only reject invalid programs if we can determine the program is invalid at compiletime. Supposing there is a type such that the typechecker is able to reject programs based on user input, we could deduce that the typechecker is doing some sort of time travel or is able to wholly simulate our deterministic universe. The best you can do as a library author is to smuggle your types into something like ExceptT, which documents the seed precondition and exports awareness for it.

Deserializing an existential data type

I need to write a Serialize instance for the following data type:
data AnyNode = forall n . (Typeable n, Serialize n) => AnyNode n
Serializing this is no problem, but I can't implement deserialization, since the compiler has no way to resolve the specific instance of Serialize n, since the n is isolated from the outer scope.
There's been a related discussion in 2006. I am now wondering whether any sort of solution or a workaround has arrived today.
You just tag the type when you serialize, and use a dictionary to untag the type when you deserialize. Here's some pseudocode omitting error checking etc:
serialAnyNode (AnyNode x) = serialize (typeOf n, serialize x)
deserialAnyNode s = case deserialize s of
(typ,bs) -> case typ of
"String" -> AnyNode (deserialize bs :: String)
"Int" -> AnyNode (deserialize bs :: Int)
....
Note that you can only deserialize a closed universe of types with your function. With some extra work, you can also deserialize derived types like tuples, maybes and eithers.
But if I were to declare an entirely new type "Gotcha" deriving Typeable and Serialize, deserialAnyNode of course couldn't deal with it without extension.
You need to have some kind of centralised "registry" of deserialization functions so you can dispatch on the actual type (extracted from the Typeable information). If all types you want to deserialize are in the same module this is pretty easy to set up. If they are in multiple modules you need to have one module that has the mapping.
If your collection of types is more dynamic and not easily available at compile time, you can perhaps use the dynamic linking to gain access to the deserializers. For each type that you want to deserialize you export a C callable function with a name derived from the Typeable information (you could use TH to generate these). Then at runtime, when you want to deserialize a type, generate the same name and the use the dynamic linker to get hold of the address of the function and then an FFI wrapper to get a Haskell callable function. This is a rather involved process, but it can be wrapped up in a library. No, sorry, I don't have such a library.
It's hard to tell what you're asking here, exactly. You can certainly pick a particular type T, deserialize a ByteString to it, and store it in an AnyNode. That doesn't do the user of an AnyNode much good, though -- you still picked T, after all. If it wasn't for the Typeable constraint, the user wouldn't even be able to tell what the type is (so let's get rid of the Typeable constraint because it makes things messier). Maybe what you want is a universal instead of an existential.
Let's split Serialize up into two classes -- call them Read and Show -- and simplify them a bit (so e.g. read can't fail).
So we have
class Show a where show :: a -> String
class Read a where read :: String -> a
We can make an existential container for a Show-able value:
data ShowEx where
ShowEx :: forall a. Show a => a -> ShowEx
-- non-GADT: data ShowEx = forall a. Show a => ShowEx a
But of course ShowEx is isomorphic to String, so there isn't a whole lot point to this. But note that an existential for Read is has even less point:
data ReadEx where
ReadEx :: forall a. Read a => a -> ReadEx
-- non-GADT: data ReadEx = forall a. Read a => ReadEx a
When I give you a ReadEx -- i.e. ∃a. Read a *> a -- it means that you have a value of some type, and you don't know what the type is, but you can a String into another value of the same type. But you can't do anything with it! read only produces as, but that doesn't do you any good when you don't know what a is.
What you might want with Read would be a type that lets the caller choose -- i.e., a universal. Something like
newtype ReadUn where
ReadUn :: (forall a. Read a => a) -> ReadUn
-- non-GADT: newtype ReadUn = ReadUn (forall a. Read a => a)
(Like ReadEx, you could make ShowUn -- i.e. ∀a. Show a => a -- and it would be just as useless.)
Note that ShowEx is essentially the argument to show -- i.e. show :: (∃a. Show a *> a) -> String -- and ReadUn is essentially the return value of read -- i.e. read :: String -> (∀a. Read a => a).
So what are you asking for, an existential or a universal? You can certainly make something like ∀a. (Show a, Read a) => a or ∃a. (Show a, Read a) *> a, but neither does you much good here. The real issue is the quantifier.
(I asked a question a while ago where I talked about some of this in another context.)

Resources