What justification is there for the type of f x = f x in Haskell?

Haskell gives f x = f x the type t1 -> t, but could someone explain why?
And, is it possible for any other, nonequivalent function to have this same type?

Okay, starting from the function definition f x = f x, let's step through and see what we can deduce about the type of f.
Start with a completely unspecified type variable, a. Can we deduce more than that? Yes, we observe that f is a function taking one argument, so we can refine a into a function type between two fresh type variables, which we'll call b -> c. Whatever type b stands for is the type of the argument x, and whatever type c stands for must be the type of the right-hand side of the definition.
What can we figure out about the right-hand side? Well, we have f, which is a recursive reference to the function we're defining, so its type is still b -> c, where both type variables are the same as for the definition of f. We also have x, which is a variable bound within the definition of f and has type b. Applying f to x type checks, because they're sharing the same unknown type b, and the result is c.
At this point everything fits together and with no other restrictions, we can make the type variables "official", resulting in a final type of b -> c where both variables are the usual, implicitly universally quantified type variables in Haskell.
In other words, f is a function that takes an argument of any type and returns a value of any type. How can it return any possible type? It can't, and we can observe that evaluating it produces only an infinite recursion.
For the same reason, any function with the same type will be "equivalent" in the sense of never returning when evaluated.
An even more direct version is to remove the argument entirely:
foo :: a
foo = foo
...which is also universally quantified and represents a value of any type. This is pretty much equivalent to undefined.
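Since foo inhabits every type, it type checks wherever any value is expected; it just never produces a result when forced. A tiny sketch (n and s are made-up names):
n :: Int
n = foo -- type checks, but forcing n loops forever
s :: String
s = foo -- the same definition reused at a different type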

f x = undefined
has the alpha-equivalent type f :: t -> a (the same type up to renaming of type variables).
If you're curious, Haskell's type system is derived from Hindley–Milner. Informally, the typechecker starts off with the most permissive types for everything, and unifies the various constraints until what remains is consistent (or not). In this case, the most general type is f :: t1 -> t, and there are no additional constraints.
Compare to
f x = f (f x)
which has inferred type f :: t -> t, due to unifying the types of the argument of f on the LHS and the argument to the outer f on the RHS.
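You can watch the inference happen in GHCi; a sketch of such a session (the exact type-variable names GHC prints vary between versions):
ghci> f x = f x
ghci> :type f
f :: t1 -> t
ghci> g x = g (g x)
ghci> :type g
g :: t -> t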

Related

Is it possible to define a function in Haskell that takes one parameter, ignores it, and returns itself?

I would like to define the following function:
f a = f
This function takes one argument and returns itself ignoring the argument. Just writing it like this in ghci gives me the following type error:
• Couldn't match expected type ‘t’ with actual type ‘p0 -> t’
  ‘t’ is a rigid type variable bound by
    the inferred type of f :: t
If the function used its argument, it would be possible (as in this halt example):
halt = halt
or with one parameter
halt x = halt x
Would it somehow be possible to name this type or compile the former program? Of course, this is not for any specific reason; I'm just trying to understand type theory, specifically in Haskell, better. You could never apply this function enough times to yield a concrete result.
Well, you can "almost" have that.
As already mentioned, you can't have an infinite type like a -> b -> c -> ..., since Haskell types must be finite.
However, we can define a recursive newtype which is "morally" the infinite type above, and use that:
{-# LANGUAGE RankNTypes #-}

newtype F = F { unF :: forall a. a -> F }

f :: F
f = F (\x -> f)
In simpler terms, this constructs a type F satisfying F ~= forall a . a -> F, which is the equation giving rise to the desired "infinite" type.
Note the use of the forall a to allow x to have any type at all. The (almost-)function f is essentially a "hungry" function that will eat any argument, of any type, and return itself.
There is a downside, though, to this approach. To actually apply f, we need to unwrap the constructor. This leads to code like
g :: F
g = unF (unF f True) "hello"
which is a bit more inconvenient than g = f True "hello".
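If the explicit unwrapping bothers you, one option is a small application operator (a sketch; the operator name # is made up here):
infixl 1 #
(#) :: F -> a -> F
h # x = unF h x

g' :: F
g' = f # True # "hello"
This keeps the left-to-right application order while hiding the unF calls.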

Multiple types for f in this picture?

https://youtu.be/brE_dyedGm0?t=1362
data T a where
  T1 :: Bool -> T Bool
  T2 :: T a

f x y = case x of
  T1 x -> True
  T2 -> y
Simon is saying that f could be typed as T a -> a -> a, but I would think the return value MUST be a Bool, since that is an explicit result in one branch of the case expression. This question is about Haskell GADTs. Why is this the case?
This is kind of the whole point of GADTs. Matching on a constructor can cause additional type information to come into scope.
Let's look at what happens when GHC checks the following definition:
f :: T a -> a -> a
f x y = case x of
  T1 _ -> True
  T2 -> y
Let's look at the T2 case first, just to get it out of the way. y and the result have the same type, and T2 is polymorphic, so its type can be instantiated at T a to match x. All good.
Then comes the trickier case. (Note that I removed the binding of the name x inside the case: the shadowing might be confusing, the inner value isn't used, and it doesn't change anything in the explanation.) When you match on a GADT constructor like T1, one that explicitly fixes a type variable, additional constraints carrying that type information come into scope inside that branch. Here, matching on T1 introduces an (a ~ Bool) constraint, a type equality saying that a and Bool match. Therefore the literal True, of type Bool, matches the a written in the type signature. y isn't used, so the branch is consistent with T a -> a -> a.
So it all matches up: T a -> a -> a is a valid type for that definition. But as Simon says, it's ambiguous. T a -> Bool -> Bool is also a valid type for the same definition, and neither signature is more general than the other, so the definition has no principal type. It is therefore rejected unless a signature is provided, because inference cannot pick a single most general type for it.
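To see the ambiguity concretely, the same body type checks under either signature (a sketch, assuming the T defined above is in scope and GADTs is enabled; f1 and f2 are made-up names):
f1 :: T a -> a -> a
f1 x y = case x of
  T1 _ -> True
  T2 -> y

f2 :: T a -> Bool -> Bool
f2 x y = case x of
  T1 _ -> True
  T2 -> y
Without a signature, GHC has no grounds to prefer one over the other, which is exactly why it asks you to provide one.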
A value of type T a with a different from Bool can never have the form T1 x (since that has only type T Bool).
Hence, in such case, the T1 x branch in the case becomes inaccessible and can be ignored during type checking/inference.
More concretely: GADTs allow the type checker to assume type-level equations during pattern matching, and exploit such equations later on. When checking
f :: T a -> a -> a
f x y = case x of
  T1 x -> True
  T2 -> y
the type checker performs the following reasoning:
f :: T a -> a -> a
f x y = case x of
  T1 x -> -- assume: a ~ Bool
    True  -- has type Bool, hence it also has type a
  T2 ->   -- assume: a ~ a (vacuous)
    y     -- has type a
Thanks to GADTs, both branches of the case have type a, hence the whole case expression has type a and the function definition type checks.
More generally, when x :: T A and the GADT constructor was declared as K :: ... -> T B, then during type checking we can make the following assumption:
case x of
  K ... -> -- assume: A ~ B
Note that A and B can be types involving type variables (as in a ~ Bool above), which allows one to obtain useful information about them and exploit it later on.
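The classic payoff of this pattern is a well-typed interpreter; a minimal sketch (the Expr type here is illustrative, not from the question):
{-# LANGUAGE GADTs #-}

data Expr a where
  IntLit :: Int -> Expr Int
  BoolLit :: Bool -> Expr Bool

eval :: Expr a -> a
eval e = case e of
  IntLit n -> n   -- assume: a ~ Int, so n :: Int also has type a
  BoolLit b -> b  -- assume: a ~ Bool, so b :: Bool also has type a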

Is anything generative?

In the paper "Higher-order Type-level Programming in Haskell", an f :: Type -> Type is defined to be "generative" in the following way:
Definition (Generativity). f is generative ⇔ f a ~ g b ⇒ f ~ g
I'm going to explicitly write out the intended quantification as I understand it:
type IsGenerative :: (Type -> Type) -> Constraint
class (forall g a b. f a ~ g b => f ~ g) => IsGenerative f
Conversely, in words:
F :: Type -> Type is generative if there is no G :: Type -> Type besides F such that there exist A, B :: Type for which F A ~ G B
The paper goes on to make a statement about the generativity of unsaturated type-families (they're not generative). To my understanding, in order to be able to form the proposition of whether or not unsaturated type-families are generative, the variables f, g :: Type -> Type should range over type-families as well as type constructors. Note that this means the ~ in f ~ g must represent some more abstract sense of definitional equality than GHC's (~) :: (Type -> Type) -> (Type -> Type) -> Constraint, which cannot be applied to unsaturated type families.
Now here's the problem: it doesn't seem like anything is generative. You'd expect that a datatype constructor like Maybe :: Type -> Type would be generative, but I can easily construct a distinct type family G :: Type -> Type and A, B :: Type for which F A ~ G B (despite F /~ G).
{-# LANGUAGE TypeFamilies, StandaloneKindSignatures, GADTs, ConstraintKinds #-}

import Data.Kind (Type)

type G :: Type -> Type
type family G a where
  G _ = Maybe Int

data Dict c where
  Dict :: c => Dict c

lhs :: Dict (Maybe Int ~ G String)
lhs = Dict
As I said before, we can't actually form the proposition Maybe ~ G within GHC (because G is not saturated), but if F ~ G is taken to mean "F is definitionally equal to G", it's pretty obvious that Maybe /~ G. So it seems like Maybe is not actually generative in the sense defined in the paper. And it seems to me that any data/newtype is susceptible to a similar sequence of reasoning.
So where am I going wrong?
Is my assumption that F, G are allowed to range over type-families as well as type constructors justified? If not, generativity seems like a rather trivial property: "we cannot form the proposition of whether type families are generative, so type families are not generative".
Am I misunderstanding how the variables are quantified in the statement of generativity?
Are there actually any type-level expressions f :: Type -> Type that satisfy the formal property of being generative?
Eh, you're overthinking it. The ~ really is the one from GHC. If you prefer, replace the claim "unsaturated type families are not generative" with "if we expanded ~ to allow unsaturated type families¹, then they would not be guaranteed generative²". This latter fact is (part of) the reason we don't bother expanding ~ to allow unsaturated type families -- it would be much less useful for them than it is for other type expressions.
If they were not precise about this divide in the paper, it's just a bit of slightly sloppy writing, such as we've all done at one point or another.
¹ You can probably deal with the G/Maybe situation by simply allowing type families on one side of ~ but not the other.
² In fact, I believe it's even stronger: they would be guaranteed not to be generative.
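To see what generativity buys in practice: GHC's solver decomposes equalities headed by ordinary type constructors precisely because they are generative and injective, and refuses to do so for type families. A sketch of the contrast (castMaybe, H, and castH are made-up names; ~ constraints need an extension such as TypeFamilies or GADTs):
{-# LANGUAGE TypeFamilies #-}

-- Accepted: from Maybe a ~ Maybe b, GHC concludes a ~ b.
castMaybe :: (Maybe a ~ Maybe b) => a -> b
castMaybe = id

type family H a

-- Rejected if uncommented: H a ~ H b does not imply a ~ b.
-- castH :: (H a ~ H b) => a -> b
-- castH = id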

Code unexpectedly accepted by GHC/GHCi

I don't understand why this code should pass type-checking:
foo :: (Maybe a, Maybe b)
foo = let x = Nothing in (x,x)
Since each component is bound to the same variable x, I would expect the most general type for this expression to be (Maybe a, Maybe a). I get the same results if I use a where instead of a let. Am I missing something?
Briefly put, the type of x gets generalized by let. This is a key step in the Hindley-Milner type inference algorithm.
Concretely, let x = Nothing initially assigns x the type Maybe t, where t is a fresh type variable. Then, the type gets generalized, universally quantifying all its type variables (technically: except those in use elsewhere, but here we only have t). This causes x :: forall t. Maybe t. Note that this is exactly the same type as Nothing :: forall t. Maybe t.
Hence, each time we use x in our code, that refers to a potentially different type Maybe t, much like Nothing. Using (x, x) gets the same type as (Nothing, Nothing) for this reason.
Lambdas, by comparison, do not feature the same generalization step. (\x -> (x, x)) Nothing "only" has type forall t. (Maybe t, Maybe t), where both components are forced to be of the same type. Here x is again assigned type Maybe t, with t fresh, but it is not generalized. Then (x, x) is assigned type (Maybe t, Maybe t). Only at the top level do we generalize, adding the forall t, but at that point it is too late to obtain a heterogeneous pair.
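Side by side, both of these compile while the commented-out variant does not (a sketch; hetero and homo are made-up names, and since Nothing carries no class constraints the monomorphism restriction does not block the generalization):
-- let-bound: x is generalized, so its two uses may take different types
hetero :: (Maybe Int, Maybe Char)
hetero = let x = Nothing in (x, x)

-- lambda-bound: x is monomorphic, so both components share one type
homo :: (Maybe Int, Maybe Int)
homo = (\x -> (x, x)) Nothing

-- bad :: (Maybe Int, Maybe Char)
-- bad = (\x -> (x, x)) Nothing -- rejected: x cannot be two types at once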

Abstract data types in a type class definition

I'm trying to understand what's happening with the type s below:
class A a where
  f :: a -> s

data X = X

instance A X where
  f x = "anything"
I expected this to work, thinking that since type s isn't bound to anything, it could be anything. But the compiler says that it "Couldn't match expected type ‘s’ with actual type ‘[Char]’", as if s were a fixed type like Int, Char…
So my second interpretation was to say that, since we don't know anything about s in the type class declaration, we cannot tell when making X an instance of A if the return value of the function f we give matches type s or not. But there are type classes that use abstract data types that are not bound to anything without problems, like Functor:
class Functor f where
  fmap :: (a -> b) -> f a -> f b
Why is the type s above a problem when types a and b here aren't?
You're trying to express this:
f :: a -> ∃s . s
...but what the signature you've written says is actually
f :: a -> ∀s . s
What does all of that mean?
The existential type ∃s . s means the function may return a value of some type, i.e. “there exists a type s such that the function returns an s value”. This is not supported directly by the Haskell language, because it turns out to be pretty useless.
The universal type ∀s . s means the function is able to produce a value of any type, i.e. “for all types s, the function can return an s value”.
The latter is very useful; fmap is actually a good example: that function works, no matter what types a and b are, and the user is always guaranteed that the result will actually have the desired type, namely f b.
But that means you can't just pick some particular type in the implementation, like you did with String. ...Well, actually you can do that, but only by wrapping the existential in a data type:
{-# LANGUAGE ExistentialQuantification, UnicodeSyntax #-}

data Anything = ∀ s . Anything s

class A a where
  f :: a -> Anything

instance A X where
  f x = Anything "anything"
...but as I said, this is almost completely useless, because when somebody wants to use that instance they'll have no way to know what particular type the wrapped result value has. And there is nothing you can do with a value of completely unknown type.
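The standard way to make such a wrapper useful is to bundle a class constraint with the existential, so consumers know at least some operations are available. A sketch (Showable and describe are made-up names):
{-# LANGUAGE ExistentialQuantification #-}

data Showable = forall s . Show s => Showable s

describe :: Showable -> String
describe (Showable s) = show s

-- describe (Showable (42 :: Int)) and describe (Showable "hi") both work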
