Associated type family complains about `pred :: T a -> Bool` with "NB: ‘T’ is a type function, and may not be injective" - haskell

This code:
{-# LANGUAGE TypeFamilies #-}
module Study where
class C a where
  type T a :: *
  pred :: T a -> Bool
Gives this error:
.../Study.hs:7:5: error:
• Couldn't match type ‘T a’ with ‘T a0’
Expected type: T a -> Bool
Actual type: T a0 -> Bool
NB: ‘T’ is a type function, and may not be injective
The type variable ‘a0’ is ambiguous
• In the ambiguity check for ‘Study.pred’
To defer the ambiguity check to use sites, enable AllowAmbiguousTypes
When checking the class method:
Study.pred :: forall a. A a => T a -> Bool
In the class declaration for ‘A’
|
7 | pred :: T a -> Bool
| ^^^^^^^^^^^^^^^^^^^
Replacing the type keyword with data fixes it.
Why is one wrong and the other correct?
What do they mean by "may not be injective"? What kind of function is it if it is not even allowed to be one-to-one? And how is this related to the type of pred?

Suppose you write these two instances:
instance C Int where
  type T Int = ()
  pred () = False
instance C Char where
  type T Char = ()
  pred () = True
So now you have two definitions of pred. Because a type family assigns just, well, type synonyms, these two definitions have the signatures
pred :: () -> Bool
and
pred :: () -> Bool
Hm, looks rather similar, doesn't it? The type checker has no way to tell them apart. What, then, is
pred ()
supposed to be? True or False?
To resolve this, you need some explicit way of providing the information of which instance the particular pred in some use case is supposed to belong to. One way to do that, as you've discovered yourself, is to change to an associated data family: data T Int = TInt and data T Char = TChar would be two distinguishable new types, rather than synonyms to types, which have no way to ensure they're actually different. I.e. data families are always injective; type families sometimes aren't. The compiler assumes, in the absence of other hints, that no type family is injective.
You can declare a type family injective with another language extension:
{-# LANGUAGE TypeFamilyDependencies #-}
class C a where
  type T a = (r :: *) | r -> a
  pred :: T a -> Bool
The = simply binds the result of T to a name so it is in scope for the injectivity annotation, r -> a, which reads like a functional dependency: the result of T is enough to determine the argument. The above instances are now illegal; type T Int = () and type T Char = () together violate injectivity. Just one by itself is admissible.
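As a minimal sketch of the injective variant, the class below is paired with two instances whose right-hand sides differ, so the injectivity annotation is satisfied. The method is renamed to predC (a hypothetical name) to avoid clashing with Prelude's pred at use sites, its type is assumed to be T a -> Bool as in the question, and the choices T Int = Bool and T Char = () are illustrative assumptions.

```haskell
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeFamilyDependencies #-}

import Data.Kind (Type)

class C a where
  -- The result r determines the argument a.
  type T a = (r :: Type) | r -> a
  -- Hypothetical name; avoids a clash with Prelude.pred.
  predC :: T a -> Bool

instance C Int where
  type T Int = Bool        -- illustrative choice
  predC b = not b

instance C Char where
  type T Char = ()         -- differs from T Int, so injectivity holds
  predC () = True

-- No annotation needed: T a ~ Bool can only come from a ~ Int.
fromBool :: Bool
fromBool = predC False

-- Likewise, T a ~ () can only come from a ~ Char.
fromUnit :: Bool
fromUnit = predC ()
```

Because each result type belongs to exactly one instance, the argument alone is enough for GHC to pick the instance.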
Alternatively, you can follow the compiler's hint; -XAllowAmbiguousTypes makes the original code compile. However, you will then need -XTypeApplications to resolve the instance at the use site:
pred @Int () == False
pred @Char () == True
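Put together, the AllowAmbiguousTypes route might look like the sketch below. The method is renamed to predC (a hypothetical name) so that use sites do not clash with Prelude's pred, and Data.Kind's Type is used in place of *; both are assumptions made for this example, not part of the original code.

```haskell
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE TypeApplications #-}

import Data.Kind (Type)

class C a where
  type T a :: Type
  -- Ambiguous on its own: a occurs only under the type family T.
  predC :: T a -> Bool

instance C Int where
  type T Int = ()
  predC () = False

instance C Char where
  type T Char = ()
  predC () = True

-- The visible type application selects the instance explicitly.
intResult :: Bool
intResult = predC @Int ()

charResult :: Bool
charResult = predC @Char ()
```

Both calls pass the same argument (), and only the @Int or @Char application tells the compiler which instance is meant.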

Related

How to EmptyCase at the type level

Using EmptyCase, it is possible to implement the following function:
{-# LANGUAGE EmptyCase, EmptyDataDecls #-}
data Void
absurd :: Void -> a
absurd v = case v of
With DataKinds, data types can be promoted to the kind level (their constructors are promoted to type constructors). This works for uninhabited data types like Void as well.
The question here is whether there is a way to write the equivalent of absurd for an uninhabited kind:
tabsurd :: Proxy (_ :: Void) -> a
tabsurd = _
This would effectively be a form of "EmptyCase at the type level". Within reason, feel free to substitute Proxy with some other suitable type (e.g. TypeRep).
NB: I understand that I can just resort to error or similar unsafe techniques here, but I want to see if there's a way to do this that would not work if the type wasn't uninhabited. So for whatever technique we come up with, it should not be possible to use the same technique to inhabit the following function:
data Unit = Unit
notsoabsurd :: Proxy (_ :: Unit) -> a
notsoabsurd = _
The type-level equivalent of pattern-matching is type classes (well, also type families, but they don't apply here, since you want a term-level result).
So you could conceivably make tabsurd a member of a class that has an associated type of kind Void:
class TAbsurd a where
  type TAbsurdVoid a :: Void
  tabsurd :: a
Here, tabsurd will have type signature tabsurd :: TAbsurd a => a, but if you insist on a Proxy, you can obviously easily convert one to the other:
pabsurd :: TAbsurd a => Proxy a -> a
pabsurd _ = tabsurd
So calling such a function, or using it in any other way, would presumably be impossible: you can't implement class TAbsurd a for any a, because you can't provide the type TAbsurdVoid a.
According to your requirement, the same approach does work fine for Unit:
data Unit = Unit
class V a where
  type VU a :: Unit
  uabsurd :: a
instance V Int where
  type VU Int = 'Unit
  uabsurd = 42
Keep in mind however that in Haskell, any kind (including Void) is potentially inhabited by a non-terminating type family. For example, this works:
type family F a :: x
instance TAbsurd Int where
  type TAbsurdVoid Int = F String
  tabsurd = 42
However, this limitation is akin to any type (including Void) being inhabited by the value undefined at term level, so that you can actually call absurd like this:
x = absurd undefined
The difference with type level is that you can actually call function tabsurd (given the instance above) and it will return 42:
print (tabsurd :: Int)
This can be fixed by having tabsurd return not a, but a Proxy (TAbsurdVoid a):
class TAbsurd a where
  type TAbsurdVoid a :: Void
  tabsurd :: Proxy (TAbsurdVoid a)

Why does a wildcard match work when enumerating all cases doesn't?

Consider this code:
{-# LANGUAGE GADTs #-}
data P t where
  PA :: P Int
  PB :: P Double
  PC :: P Char
isA PA = True
isA _ = False
It compiles and works fine. Now consider this code:
{-# LANGUAGE GADTs #-}
data P t where
  PA :: P Int
  PB :: P Double
  PC :: P Char
isA PA = True
isA PB = False
isA PC = False
It fails to compile:
Main.hs:8:10: error:
• Couldn't match expected type ‘p’ with actual type ‘Bool’
‘p’ is untouchable
inside the constraints: t ~ Int
bound by a pattern with constructor: PA :: P Int,
in an equation for ‘isA’
at Main.hs:8:5-6
‘p’ is a rigid type variable bound by
the inferred type of isA :: P t -> p
at Main.hs:(8,1)-(10,14)
Possible fix: add a type signature for ‘isA’
• In the expression: True
In an equation for ‘isA’: isA PA = True
• Relevant bindings include isA :: P t -> p (bound at Main.hs:8:1)
|
8 | isA PA = True
| ^^^^
Main.hs:9:10: error:
• Couldn't match expected type ‘p’ with actual type ‘Bool’
‘p’ is untouchable
inside the constraints: t ~ Double
bound by a pattern with constructor: PB :: P Double,
in an equation for ‘isA’
at Main.hs:9:5-6
‘p’ is a rigid type variable bound by
the inferred type of isA :: P t -> p
at Main.hs:(8,1)-(10,14)
Possible fix: add a type signature for ‘isA’
• In the expression: False
In an equation for ‘isA’: isA PB = False
• Relevant bindings include isA :: P t -> p (bound at Main.hs:8:1)
|
9 | isA PB = False
| ^^^^^
Main.hs:10:10: error:
• Couldn't match expected type ‘p’ with actual type ‘Bool’
‘p’ is untouchable
inside the constraints: t ~ Char
bound by a pattern with constructor: PC :: P Char,
in an equation for ‘isA’
at Main.hs:10:5-6
‘p’ is a rigid type variable bound by
the inferred type of isA :: P t -> p
at Main.hs:(8,1)-(10,14)
Possible fix: add a type signature for ‘isA’
• In the expression: False
In an equation for ‘isA’: isA PC = False
• Relevant bindings include isA :: P t -> p (bound at Main.hs:8:1)
|
10 | isA PC = False
| ^^^^^
Why? What's going on here?
Edit: Adding the type signature isA :: P t -> Bool makes it work, so my question now becomes: why doesn't type inference work in the second case, since it does in the first case?
In typing a case construct (whether an explicit case statement or an implicit pattern-based function definition) in the absence of GADTs, the individual alternatives:
pattern -> body
can be unified by typing all the patterns and unifying those with the type of the scrutinee, then typing all the bodies and unifying those with the type of the case expression as a whole. So, in a simple example like:
data U = UA | UB | UC
isA1 u = case u of
  UA -> True
  UB -> False
  x -> False
we can initially type the patterns UA :: U, UB :: U, x :: a, unify them using the type equality a ~ U to infer the type of the scrutinee u :: U, and similarly unify True :: Bool and both of the False :: Bool to the type of the overall case expression Bool, unifying that with the type of isA1 to get isA1 :: U -> Bool.
Note that the process of unification can specialize the types. Here, the type of the pattern x :: a was general, but by the end of the unification process, it had been specialized to x :: U. This can happen with the bodies, too. For example:
len mstr = case mstr of
  Nothing -> 0
  Just str -> length str
Here, 0 :: Num a => a is polymorphic, but because length returns an Int, by the end of the process, the bodies (and so the entire case expression) have been unified to the type Int.
In general, through unification, the common, unified type of all the bodies (and so the type of the overall case expression) will be the "most general" / "least restrictive" type of which the types of the bodies are all generalizations. In some cases, this type might be the type of one of the bodies, but in general, all the bodies can be more general than the "most general" unified type, but no body can be more restrictive.
Things change in the presence of GADTs. When type-checking case constructs with GADTs, the patterns in an alternative can introduce a "type refinement", a set of additional bindings of type variables to be used in type-checking the bodies of the alternative. (This is what makes GADTs useful in the first place.)
Because different alternatives' bodies are typed under different refinements, naive unification isn't possible. For example, consider the tiny typed DSL and its interpreter:
data Term a where
  Lit :: Int -> Term Int
  IsZ :: Term Int -> Term Bool
  If :: Term Bool -> Term a -> Term a -> Term a
eval :: Term a -> a
eval t = case t of
  Lit n -> n
  IsZ t -> eval t == 0
  If b t e -> if eval b then eval t else eval e
If we were to naively unify the bodies n :: Int, eval t == 0 :: Bool, and if eval b then eval t else eval e :: a, the program wouldn't type check (most obviously, because Int and Bool don't unify!).
In general, because type refinements allow the calculated types of the alternatives' bodies to be more specific than the final type, there's no obvious "most general" / "least restrictive" type to which all bodies can be unified, as there was for the case expression without GADTs.
Instead, we generally need to make available a "target" type for the overall case expression (e.g., for eval, the return type a in the type signature), and then determine if under each refinement introduced by a constructor (e.g., IsZ introducing the refinement a ~ Bool), the body eval t == 0 :: Bool has as its type the associated refinement of a.
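To see the target type at work, here is a self-contained sketch of the same interpreter with two sample terms. The bound variables are renamed to avoid shadowing, and the names sample and isZeroThree are hypothetical additions for illustration.

```haskell
{-# LANGUAGE GADTs #-}

data Term a where
  Lit :: Int -> Term Int
  IsZ :: Term Int -> Term Bool
  If  :: Term Bool -> Term a -> Term a -> Term a

-- The signature supplies the target type a; each alternative is
-- checked against a under its own refinement (a ~ Int, a ~ Bool, ...).
eval :: Term a -> a
eval t = case t of
  Lit n    -> n                      -- checked with a ~ Int
  IsZ u    -> eval u == 0            -- checked with a ~ Bool
  If b x y -> if eval b then eval x else eval y

-- Hypothetical sample terms.
sample :: Int
sample = eval (If (IsZ (Lit 0)) (Lit 10) (Lit 20))

isZeroThree :: Bool
isZeroThree = eval (IsZ (Lit 3))
```

Here sample evaluates to 10 and isZeroThree to False; deleting the signature for eval makes the definition fail to type check, for the reasons described above.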
If no target type is explicitly provided, then the best we can do -- in general -- is use a fresh type variable p as the target and try to check each refined type against that.
This means that, given the following definition without a type signature for isA2:
data P t where
  PA :: P Int
  PB :: P Double
  PC :: P Char
isA2 = \p -> case p of
  PA -> True
  PB -> False
  PC -> False
what GHC tries to do is type isA2 :: P t -> p. For the alternative:
PA -> True
it types PA :: P t giving the refinement t ~ Int, and under this refinement, it tries to type True :: p. Unfortunately, p is not Bool under any refinement involving the unrelated type variable t, and we get an error. Similar errors are generated for the other alternatives, too.
Actually, there's one more thing we can do. If there are alternatives that don't introduce a type refinement, then the calculated types of their bodies are NOT more specific than the final type. So, if we unify the body types for "unrefined" alternatives, the resulting type provides a legitimate unification target for the refined alternatives.
This means that, for the example:
isA3 = \p -> case p of
  PA -> True
  x -> False
the second alternative:
x -> False
is typed by matching the pattern x :: P t which introduces no type refinement. The unrefined type of the body is Bool, and this type is an appropriate target for unification of the other alternatives.
Specifically, the first alternative:
PA -> True
matches with a type refinement t ~ Int. Under this refinement, the actual type of the body True :: Bool matches the "refinement" of the target type Bool (which is "refined" to Bool), and the alternative is determined to have valid type.
So, the intuition is that, without a wildcard alternative, the inferred type for the case expression is an arbitrary type variable p, which is too general to be unified with the type refining alternatives. However, when you add a wildcard case alternative _ -> False, it introduces a more restrictive body type Bool into the unification process that, having been deduced without any type refinement by the pattern _, can inform the unification algorithm by providing a more restrictive type Bool to which the other, type refined alternatives, can be unified.
Above, I've made it sound like there's some two-phase approach, where the "non-refining" alternatives are examined first to determine a target type, and then the refining alternatives are checked against it.
In fact, what happens is that the refinement process introduces fresh variables into the unification process that, even when they're unified, don't affect the larger type context. So, all the alternatives are unified at once, but unification of the unrefining alternatives affects the larger type context while unification of the refined alternatives affects a bunch of fresh variables, giving the same end result as if the unrefined and refined alternatives were processed separately.
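For completeness, the fully enumerated version from the question can be sketched with an explicit signature, which supplies Bool as the target type so that every refined alternative has something rigid to be checked against:

```haskell
{-# LANGUAGE GADTs #-}

data P t where
  PA :: P Int
  PB :: P Double
  PC :: P Char

-- With the signature, the result type is the rigid type Bool rather
-- than a fresh variable p, so each alternative type checks under its
-- own refinement (t ~ Int, t ~ Double, t ~ Char).
isA :: P t -> Bool
isA PA = True
isA PB = False
isA PC = False
```

This is the same fix GHC suggests in its "Possible fix: add a type signature" hint.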
Disclaimer: I'm writing this as an answer because it doesn't fit in a comment, but I might be wrong.
This is the expected behaviour when pattern matching on GADTs. According to GHC's User Manual:
type refinement is only carried out based on user-supplied type
annotations. So if no type signature is supplied for eval, no type
refinement happens, and lots of obscure error messages will occur
Also from the User Manual:
When pattern-matching against data constructors drawn from a GADT, for
example in a case expression, the following rules apply:
The type of the scrutinee must be rigid.
The type of the entire case expression must be rigid.
The type of any free variable mentioned in any of the case alternatives must be rigid.
Note: a type variable is rigid iff it is specified by the user.
According to this, when pattern matching against a GADT you must provide a type signature (the reason is that type inference is difficult for GADTs). Hence, apparently the first definition of isA should fail to compile, but the paper in which type inference for GADTs is explained says (section 6.4):
We remarked in Section 4.3 that in PCON-R it would be unsound to use
any unifier other than a most-general one. But must the refinement be
a unifier at all? For example, even though the case expression could
do refinement, no refinement is necessary to typecheck this function:
f :: Term a -> Int
f (Lit i) = i
f _ = 0
The above example is exactly your case! In the paper this is called a pre-unifier, and there is a very technical explanation of how it works, but as far as I can understand, when writing:
data P t where
  PA :: P Int
  PB :: P Double
  PC :: P Char
isA PA = True
isA PB = False
isA PC = False
the compiler starts by deducing isA :: P t -> p and refuses to continue, because the type variables aren't rigid (i.e. aren't user-specified),
whereas when writing:
data P t where
  PA :: P Int
  PB :: P Double
  PC :: P Char
isA PA = True
isA _ = False
the compiler can deduce that any inferred type will be less general than one with Bool as the return type, hence it can safely deduce isA :: P t -> Bool.
This probably seems as obscure to you as it does to me, but the two cases you ask about are in fact documented, so this is presumably the behaviour the GHC developers intended and not a weird bug.

Change variable type to match the expected type

In the following code, I get the error
Couldn't match type 'Integer' with 'Int'
Expected type :[(Test, [Test])]
Actual type : [(Integer, [Integer])]
when executing
testFunc a
with the following declaration
type TestType = Int
a = [(1,[2,3])]
testFunc :: [(TestType ,[TestType])] -> TestType
testFunc ((a,(b:c)):d) = a
How do I declare my list a so that it matches the type of testFunc?
And is there a way to fix the error without modifying type TestType = Int or the declaration of a?
How do I declare my list a so that it matches the type of testFunc?
Well, by declaring this as the type.
a :: [(TestType, [TestType])]
a = [(1,[2,3])]
Generally speaking, you should always give explicit type signatures for top-level definitions like this. Without such a signature, the compiler will pick one for you. Generally Haskell tries to pick the most general available type possible; in this case that would be
a :: (Num a, Num b) => [(a, [b])]
...which would include both [(Int, [Int])] and [(Integer, [Integer])]. However, the monomorphism restriction restricts the type by default, excluding such polymorphism. So GHC has to pick one version, and the default one is Integer, not Int.
The right solution, again, is to provide an explicit signature. However, you can also turn off the monomorphism restriction:
{-# LANGUAGE NoMonomorphismRestriction #-}
type TestType = Int
a = [(1,[2,3])]
testFunc :: [(TestType ,[TestType])] -> TestType
testFunc ((x,(y:z)):w) = x
main :: IO ()
main = print $ testFunc a
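If the declaration of a really must stay untouched and it has already been fixed to the defaulted Integer type, one more option is to convert at the call site. The sketch below assumes a :: [(Integer, [Integer])] (the type the error message reports); the helper toTestType and the fallback clause of testFunc are hypothetical additions for this example.

```haskell
type TestType = Int

-- Assumption: this is the type a ends up with after defaulting.
a :: [(Integer, [Integer])]
a = [(1, [2, 3])]

testFunc :: [(TestType, [TestType])] -> TestType
testFunc ((x, _ : _) : _) = x
testFunc _                = 0   -- hypothetical fallback for unmatched input

-- Hypothetical helper: convert every Integer to TestType.
-- fromIntegral wraps around silently if a value does not fit in Int.
toTestType :: [(Integer, [Integer])] -> [(TestType, [TestType])]
toTestType = map (\(x, ys) -> (fromIntegral x, map fromIntegral ys))

result :: TestType
result = testFunc (toTestType a)
```

An explicit signature on a remains the simpler fix; the conversion is only worth it when the list's type is outside your control.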

Difference between type family and partial newtype? (and partial data?)

I've had to interface two libraries where metadata is represented as a type parameter in one and as a record field in the other. I wrote an adaptor using a GADT. Here's a distilled version:
{-# LANGUAGE GADTs #-}
newtype TFId a = MkId a
data TFDup a = MkDup !a !a
data GADT tf where
ConstructorId :: GADT TFId
ConstructorDup :: GADT TFDup
main = do
  f ConstructorId
  f ConstructorDup
f :: GADT tf -> IO ()
f = _
This works. (May not be perfect; comments welcome, but that's not the question.)
It took me some time to get to this working state. My initial intuition was to use a type family for TFId, figuring: “GADT has kind (* -> *) -> *; in ConstructorDup TFDup has kind * -> *; so for ConstructorId I can use the following * -> * type family:”
{-# LANGUAGE TypeFamilies #-}
type family TFId a where TFId a = a
The type constructor does have the same kind * -> *, but GHC apparently won't have it in the same place:
error: …
The type family ‘TFId’ should have 1 argument, but has been given none
In the definition of data constructor ‘ConstructorId’
In the data type declaration for ‘GADT’
Well, if it says so…
I'm not sure I understand why it would make such a difference. Can't type families be used without applying them? What's going on? Is there any other (better) way to do this?
Injectivity.
type family F (a :: *) :: *
type instance F Int = Bool
type instance F Char = Bool
Here F Int ~ F Char. However,
data G (a :: *) = ...
will never cause G Int ~ G Char. These are guaranteed to be distinct types.
In universal quantifications like
foo :: forall f a. f a -> a
f is allowed to be G (injective) but not allowed to be F (not injective).
This is to make inference work. foo (... :: G Int) can be inferred to have type Int. foo (... :: F Int) is equivalent to foo (... :: Bool) which may have type Int, or type Char -- it's an ambiguous type.
Also consider foo True. We can't expect GHC to choose f ~ F, a ~ Int (or Char) for us. This would involve looking at all type families to see whether Bool can be produced by any of them -- essentially, we would need to invert all the type families. Even if this were feasible, it would generate a huge number of possible solutions, so it would be ambiguous.
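By contrast, the newtype-based adaptor from the question works precisely because TFId and TFDup are real type constructors. As a rough sketch, matching on the GADT refines tf to one of them, and that refinement can then be used to build or take apart values of type tf a. The functions wrap and firstOf are hypothetical names introduced for this example.

```haskell
{-# LANGUAGE GADTs #-}

newtype TFId a = MkId a
data TFDup a = MkDup !a !a

data GADT tf where
  ConstructorId  :: GADT TFId
  ConstructorDup :: GADT TFDup

-- Matching on the tag refines tf, so the right constructor type checks.
wrap :: GADT tf -> a -> tf a
wrap ConstructorId  x = MkId x
wrap ConstructorDup x = MkDup x x

-- The refinement from the first pattern is in scope for the second.
firstOf :: GADT tf -> tf a -> a
firstOf ConstructorId  (MkId x)    = x
firstOf ConstructorDup (MkDup x _) = x
```

For example, firstOf ConstructorDup (wrap ConstructorDup 'x') gives back 'x'. With a type family in place of TFId, tf could not be instantiated this way.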

How to tell whether variable is a certain data in Haskell?

Edit: This class instance of QWhere fails when it's passed input like this: >qWhere fly john even though fly is type Argument -> Argument -> Predicate and john is type Argument.
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}
data Argument = Argument { ttype :: Type, value :: String } deriving (Show, Eq)
data Predicate = Predicate { lemma :: String, arguments :: [Argument] } deriving (Show, Eq)
class Fly a b where
  fly :: a -> b -> Predicate
instance Fly Argument Argument where
  fly x y = Predicate { lemma = "fly", arguments = [x, y] }
instance Fly Argument Predicate where
  fly x y = Predicate { lemma = "fly", arguments = [x, arguments y !! 0] }
class QWhere a b where
  qWhere :: a -> b -> String
instance QWhere (Argument -> Argument -> Predicate) Argument where
  qWhere x y = "hi"
This is the output from the ghci:
No instance for (QWhere (a0 -> b0 -> Predicate) Argument)
arising from a use of ‘qWhere’
The type variables ‘a0’, ‘b0’ are ambiguous
Note: there is a potential instance available:
instance QWhere (Argument -> Argument -> Predicate) Argument
-- Defined at new_context.hs:116:10
In the expression: qWhere fly john
In an equation for ‘it’: it = qWhere fly john
No instance for (Fly a0 b0) arising from a use of ‘fly’
The type variables ‘a0’, ‘b0’ are ambiguous
Note: there are several potential instances:
instance Fly Argument Predicate
-- Defined at new_context.hs:110:10
instance Fly Argument Argument
-- Defined at new_context.hs:107:10
In the first argument of ‘qWhere’, namely ‘fly’
In the expression: qWhere fly john
In an equation for ‘it’: it = qWhere fly john
These questions are relevant, but none of the answers have solved my problem.
(1) Checking for a particular data constructor
(2) Test if Haskell variable matches user-defined data type option
And some internet sources which should address this question but I could not find the solution from:
(3) https://www.haskell.org/haskellwiki/Determining_the_type_of_an_expression
(4) http://okmij.org/ftp/Haskell/typeEQ.html
My problem: I have two Haskell data types defined. I am given an input and I need to determine if it belongs to data type A or data type B.
Here is the data types definition:
data Argument = Argument { ttype :: Type, value :: String } deriving (Show, Eq)
data Predicate = Predicate { lemma :: String, arguments :: [Argument] } deriving (Show, Eq)
I need a function which returns true/false if a variable is a data type Argument or Predicate.
I attempted to follow the answers of both SO questions, but only got complaints from the ghci compiler:
--checks if a variable is of data type Argument
--this does not compile (from question (2))
isArgument :: a -> Bool
isArgument (Argument _) = True
isArgument _ = False
--from question (1), also fails
isArgument :: a -> String
isArgument value =
case token of
Argument arg -> "is argument"
Predicate p -> "is predicate"
The sort of dynamic typing you are trying to do is very rarely used in Haskell. If you want to write functions that can take values of both Predicate and Argument there are at least two idiomatic ways depending on your exact use-case.
The first is to overload the function using type-classes. E.g.
class IsArgument a where
  isArgument :: a -> Bool
instance IsArgument Argument where
  isArgument _ = True
instance IsArgument Predicate where
  isArgument _ = False
The second is to use some sum-type such as Either Predicate Argument or a custom sum-type such as:
data Expr = ArgumentExpr Argument | PredicateExpr Predicate
isArgument :: Expr -> Bool
isArgument (ArgumentExpr _) = True
isArgument _ = False
You can also make Argument and Predicate constructors of the same type, but then of course you lose the type safety of treating them as separate types. You can circumvent this by using a GADT and tagging the constructors with a phantom type but this gets into the slightly more advanced type extensions that GHC offers:
{-# LANGUAGE GADTs #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE StandaloneDeriving #-}
data ExprType = Argument | Predicate
data Expr t where
  ArgumentExpr :: { ttype :: Type, value :: String } -> Expr Argument
  PredicateExpr :: { lemma :: String, arguments :: [Expr Argument] } -> Expr Predicate
deriving instance Show (Expr t)
deriving instance Eq (Expr t)
isArgument :: Expr t -> Bool
isArgument (ArgumentExpr {}) = True
isArgument _ = False
Now arguments and predicates are constructors of the same type but you can limit the values to a specific constructor by using the type parameter as is done in arguments :: [Expr Argument] but you can also just accept any expression using the type Expr t as in isArgument.
If you really really need run-time polymorphism, it can be achieved using the Typeable type-class which enables you to get runtime type information and do type-casts on opaque, generic types.
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Typeable
data Argument = Argument { ttype :: Type, value :: String } deriving (Show, Eq, Typeable)
data Predicate = Predicate { lemma :: String, arguments :: [Argument] } deriving (Show, Eq, Typeable)
isArgument :: Typeable a => a -> Bool
isArgument a = case cast a of
  Just (Argument {}) -> True
  _ -> False
The function cast tries to convert any Typeable a => a value into some known type and returns a Just value if the type-cast succeeds and Nothing if it fails.
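A self-contained variant of the cast approach is sketched below. It assumes a recent GHC, where Typeable instances are generated automatically so no deriving clause is needed, and it replaces the question's undefined Type with String purely for illustration.

```haskell
import Data.Typeable (Typeable, cast)

-- ttype is a String here only because the question's Type is not shown.
data Argument = Argument { ttype :: String, value :: String }
  deriving (Show, Eq)

data Predicate = Predicate { lemma :: String, arguments :: [Argument] }
  deriving (Show, Eq)

-- cast returns Just only when the runtime type of x is Argument.
isArgument :: Typeable a => a -> Bool
isArgument x = case (cast x :: Maybe Argument) of
  Just _  -> True
  Nothing -> False
```

With these definitions, isArgument (Argument "person" "john") is True, while isArgument (Predicate "fly" []) and isArgument 'c' are both False.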
You can do what you want by making Argument and Predicate part of the same type....
data LogicElement = Argument { ttype :: Type, value :: String }
                  | Predicate { lemma :: String, arguments :: [LogicElement] }
                  deriving (Show, Eq)
While it is possible to define a function of type (a->Bool), it is unusual, and generally implies that the value being input will be ignored (ie- how can you do anything to something if you don't even know what it is? You pretty much can only apply other (a->b) functions on it).
In this particular example, your compiler will complain about the following
isArgument (Argument _) = True
because the pattern implies that the input must be of type Argument, whereas the signature you gave uses the unconstrained type a.
When you say that isArgument has type a -> Bool, you're saying that it can take any, or rather every possible a, not just certain ones. There are a few solutions to this, though. My preference would be that for simple alternatives, just use Either:
import Data.Either
isArgument :: Either Argument Predicate -> Bool
isArgument = isLeft
Or
-- Note the use of {} instead of _, the {} expands to all fields in a record
-- the _ only would have taken the place of the ttype field, although you could have
-- (Argument _ _) to capture both fields of the Argument constructor
isArgument :: Either Argument Predicate -> Bool
isArgument (Left (Argument {})) = True
isArgument _ = False
Although, the only use of this sort of function would be when you aren't sure which data type you have, and a value in Haskell can't be ambiguously typed. It would be equivalent to if you had in Java/C/C++/C# something like
if (some_condition) {
x = "a string";
} else {
x = 5;
}
These are statically typed languages; a variable can't take on values of two different types. The same holds in Haskell. If you wanted to give a type to a variable that could take on values of two different types, you'd have to write a container for it. The Either container in Haskell is pre-defined for this purpose.
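As for the qWhere fly john error quoted in the question's edit: fly is overloaded, so on its own it does not have the type Argument -> Argument -> Predicate, and GHC cannot pick an instance. A hedged sketch of one way around this is to annotate fly at the call site. The String type for ttype and the value john are assumptions made for this example.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}

-- ttype is a String here only because the question's Type is not shown.
data Argument = Argument { ttype :: String, value :: String }
  deriving (Show, Eq)

data Predicate = Predicate { lemma :: String, arguments :: [Argument] }
  deriving (Show, Eq)

class Fly a b where
  fly :: a -> b -> Predicate

instance Fly Argument Argument where
  fly x y = Predicate { lemma = "fly", arguments = [x, y] }

class QWhere a b where
  qWhere :: a -> b -> String

instance QWhere (Argument -> Argument -> Predicate) Argument where
  qWhere _ _ = "hi"

-- Hypothetical sample value.
john :: Argument
john = Argument { ttype = "person", value = "john" }

-- The annotation fixes a and b, so both instances can be resolved.
answer :: String
answer = qWhere (fly :: Argument -> Argument -> Predicate) john
```

Without the annotation the type variables of fly stay ambiguous, which is exactly what the two "No instance" messages report.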

Resources