I tried to come up with a way of creating an arbitrarily deep nested type where each instance depends on the type of its parent, but I couldn't come up with any solution.
I was thinking about something like a type T with two type variables, where the second is itself a T whose first type variable depends on its parent.
data T a b = T (a -> b) (T (b -> c) (T (c -> d) ...))
I know that in theory every type variable should appear on the left and right side, and I don't know if this is even possible at all, but I guess that there is some way of making this work.
My experience with GHC extensions like GADTs and RankNTypes is very limited, but as far as I know this would be a use case for those?
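One way such a chain is commonly sketched is with a GADT whose constructor hides the intermediate type, so each link's input must match the previous link's output. This is only a minimal sketch under that assumption; the names Chain, Nil, Cons, runChain and example are hypothetical and not taken from the question.

```haskell
{-# LANGUAGE GADTs #-}

-- Hypothetical type-aligned chain: 'Chain a b' holds functions whose
-- types line up end to end, starting at 'a' and finishing at 'b'.
data Chain a b where
  Nil  :: Chain a a
  -- The intermediate type 'x' appears only on the right-hand side;
  -- the GADT constructor quantifies it existentially.
  Cons :: (a -> x) -> Chain x b -> Chain a b

-- Collapse the chain into a single function by composing every link.
runChain :: Chain a b -> a -> b
runChain Nil           = id
runChain (Cons f rest) = runChain rest . f

-- Int -> Int, then Int -> String, then String -> String.
example :: Chain Int String
example = Cons (+ 1) (Cons show (Cons (++ "!") Nil))
```

Because the intermediate types are hidden, only the first input type and the last output type show up in the type of the whole chain.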
In my project I ended up using the following unfold function in several places.
unfold :: (a -> Either b a) -> a -> b
unfold f x = either id (unfold f) $ f x
It feels like a very general recursion pattern (simply apply a function to values of type a until you get a Left b), but I can't find such a function anywhere with Hoogle.
Is it really not defined anywhere?
This function is implemented in the extra package as Control.Monad.Extra.loop.
loop :: (a -> Either a b) -> a -> b
A looping operation, where the predicate returns Left as a seed for the next loop or Right to abort the loop.
loop (\x -> if x < 10 then Left $ x * 2 else Right $ show x) 1 == "16"
The extra package also provides a monadic version loopM.
For future reference, in addition to searching Hoogle for a function by name, you can simply plug in a type signature and Hoogle will look for functions matching that signature. I found the above function by entering (a -> Either a b) -> a -> b into Hoogle.
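To make the relationship between loop and the question's unfold concrete, here is a minimal sketch. It assumes a local re-definition of loop with the type quoted above, so that it runs without the extra package; with extra installed you would import Control.Monad.Extra instead. The helper name unfoldViaLoop is hypothetical.

```haskell
-- Local stand-in with the same type as the loop described above:
-- Left carries the seed for the next round, Right carries the result.
loop :: (a -> Either a b) -> a -> b
loop step x = case step x of
  Left next    -> loop step next
  Right result -> result

-- The question's unfold uses the opposite convention (Left is the
-- result, Right is the next seed), so swap the constructors first.
unfoldViaLoop :: (a -> Either b a) -> a -> b
unfoldViaLoop f = loop (either Right Left . f)
```

So the two functions differ only in which constructor means "stop".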
If you want a version with the type arguments flipped from the extra version, there's also iterateM_ from the monad-loops package.
iterateM_ :: Monad m => (a -> m a) -> a -> m b
This doesn't solve your problem in one step, but it's capturing the underlying pattern of recursion more accurately.
Note that there's nothing specific to Either going on in there, which does make it a bit trickier to find. I found it via Hoogle by name. I didn't know if it existed, but I knew that if something by that name did exist, it would solve your problem. I derived the name starting from iterate in base, which repeatedly applies a function to the output from the previous step, with an initial seed value, like you wanted. But you wanted to be able to short circuit at any time, which Either's monad instance does for you, so a name like iterateM would make more sense. Except you don't care about all the intermediate results, so iterateM_ would be the name that makes sense.
(Also, it turns out everything that does produce the intermediate results requires a streaming system to do it. That makes sense in retrospect, and it explains why Hoogle entirely changed which packages it was showing results from when I added the "_" at the end.)
Now, as far as this not actually being exactly the type you want: yes, a bit was lost when it became more generic. But that can be recovered in a bit of an interesting way. Look back at the type of iterateM_, and notice that the return value is m b, and b wasn't mentioned anywhere previously in the type. This is subtle but significant information. Since there's no way a function can just make up a value of a polymorphic type it knows nothing about, it means that it can never produce such a value at all. Let's inline Either in the type and see what happens: iterateM_ :: (a -> Either r a) -> a -> Either r b. How can we safely convert Either r b to r? Well, we know we can choose anything at all for b, because it can't exist. And fortunately, there are tools to handle this.
base contains a module, Data.Void, that helps here. It has a type Void that has no constructors. You can think of it as a type-level signifier that something can't happen. And since it can't happen, there's an adapter to make things fit together on the value level as needed: absurd :: Void -> a. The name comes from logic, based on the idea that once the impossible has happened, you're in a silly case and might as well allow anything else to happen. At an operational level, you can have values of type Void in Haskell because undefined :: Void will type-check. But absurd is perfectly safe regardless, because it forces its argument to be evaluated. Since any time it actually is evaluated the argument must be a bottom value of some sort, this remains type-safe.
So how does this fit in with iterateM_? Well, things have worked out so there's an Either r b value that we know must be the Left constructor, but using fromLeft feels dirty. But what's interesting here is that while r is constrained by context, b is still completely polymorphic. We can choose whatever type we want for it, because we know it'll never happen. So choose Void, giving either id absurd :: Either r Void -> r
import Control.Monad.Loops (iterateM_)  -- from the monad-loops package
import Data.Void (absurd)
unfold :: (a -> Either b a) -> a -> b
unfold f x = either id absurd $ iterateM_ f x
Maybe that's not even an improvement on your original. Maybe it is - the recursion is captured in a combinator instead of explicit. And for what it's worth, that combinator captures a more general pattern of recursion than something specific to Either. But that has the cost of introducing an extra conversion step to realign the types afterwards, and that step ends up involving a new idea that wasn't necessary before. On the plus side, it makes a nice illustration for how to use the type system to communicate things clearly.
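For anyone who wants to try this without installing monad-loops, here is a self-contained sketch. The local iterateM_ is an assumption: it is written to have the same type as the one quoted above and is not copied from the package.

```haskell
import Data.Void (Void, absurd)

-- Local stand-in with the type quoted above. The result type 'b' is
-- never produced: the only way out is for the monad to short-circuit.
iterateM_ :: Monad m => (a -> m a) -> a -> m b
iterateM_ f = go
  where
    go x = f x >>= go

-- Instantiate 'm' to 'Either b' and the unreachable result type to
-- Void, then discharge the impossible Right case with absurd.
unfold :: (a -> Either b a) -> a -> b
unfold f x = either id absurd (iterateM_ f x :: Either b Void)
```

The annotation on iterateM_ f x only spells out the choice of Void; type inference would pick it anyway because absurd demands it.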
In Haskell I find it difficult to completely grasp the purpose of a kind system, and what it truly adds to the language.
I understand having kinds adds safety.
For example consider fmap :: (a -> b) -> f a -> f b vs a monokinded version of it fmap2 :: (a -> b) -> p -> q.
My understanding is that, thanks to the kind system, I can specify beforehand with higher detail what the shape of the data should be. It is better, as the type checker can check more, and stricter requirements will make it less likely that users will mess up with types. This will reduce the likelihood of programming errors. I believe this is the main motivation for kinds but I might be wrong.
Now I think fmap2 can have the same implementation as fmap, since its types are unconstrained. Is it true?
Could one just replace all multi kinded types in the base/ghc library by mono kinded ones, and would it still compile correctly?
To clarify a little bit I mean that for a class like:
class Functor f where
fmap :: (a -> b) -> f a -> f b
(<$) :: a -> f b -> f a
I might replace it by something like
class Functor a b p q where
fmap :: (a -> b) -> p -> q
(<$) :: a -> q -> p
This way for a Functor instance like
instance Functor [] where fmap = map
I might replace it by
instance Functor a b p q where fmap = map
Remark: This won't work as is, because I would also need to modify map and go down the dependency chain. I will think more about this later.
I'm trying to figure out if kinds add more than just safety? Can I do something with multi kinded types that I cannot do with mono kinded ones?
Remark: Here I forgot to mention that I'm usually using some language extensions, typically to allow more flexibility in writing classes.
In vanilla Haskell, kinds can be a really meaningful thing to use.
But when I start using type families and a few other extensions, it becomes much less clear that I need kinds at all.
I have in mind the mono-traversable library, which reimplements many standard library functions using single-kinded types and uses type families to generalize signatures. That's why my intuition is that multi-kinded variables might not really add that much expressive power after all, provided one uses type families. However, a drawback of doing this is that you lose type safety. My question is: do you really lose only that, or do you lose something else as well?
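For reference, the single-kinded style in question looks roughly like the sketch below. It is modeled on the Element family and MonoFunctor class from mono-traversable, but everything here is re-declared locally as an assumption so it compiles on its own, and the Digits type is hypothetical.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- The container has kind *; a type family recovers its element type.
type family Element mono

class MonoFunctor mono where
  -- The element type cannot change: both the function and the
  -- container stay at the same 'mono'.
  omap :: (Element mono -> Element mono) -> mono -> mono

type instance Element [a] = a

instance MonoFunctor [a] where
  omap = map

-- A container that is not of the form 'f a' can also be an instance.
newtype Digits = Digits [Int] deriving (Eq, Show)

type instance Element Digits = Int

instance MonoFunctor Digits where
  omap f (Digits ds) = Digits (map f ds)
```

Note one thing that is given up compared with fmap: something like omap show [1, 2, 3] is rejected, because omap has no way to express a result container with a different element type.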
Let's back up. It seems like you think that because the signature
id2 :: a -> b
is more general than
id :: a -> a
that id2 could have the same implementation as id. But that's not true. In fact id2 has no total implementation (any implementation of id2 involves an infinite loop or undefined).
The structure of generality creates a "pressure", and this pressure is what we must navigate to find the right type for something. If f is more general than g, that means that anywhere g could be used, f could be used as well, but it also means that any implementation of f is a valid implementation of g. So at the use site it goes one direction, at the definition site it goes the other.
We can use id2 anywhere we could have used id, but the signature of id2 is so general that it is impossible to implement. The more general a signature gets, the more contexts it can be used in, but also the less likely it is to have an implementation. I would say a main goal of a good type signature is to find the level of generality where there are as few as possible implementations of your intended function, without making it impossible to write altogether.
Your proposed signature fmap2 :: (a -> b) -> p -> q is, like id2, so general that it is impossible to implement. The purpose of higher-kinded types is to give us the tools to have very general signatures like fmap, which are not too general like fmap2.
I can separate functions from nullary values with a type family like this:
type family Funs (ts :: [*]) :: [*]
where
Funs '[ ] = '[ ]
Funs ((a -> b): ts) = (a -> b): Funs ts
Funs (k: ts) = Funs ts
What I would like is to separate types that satisfy a constraint, for instance Show. An attempt by analogy:
type family Showable (ts :: [*]) :: [*]
where
Showable '[ ] = '[ ]
Showable ((Show a => a): ts) = a: Showable ts
Showable (k: ts) = Showable ts
This leads to an error:
• Illegal qualified type: Show a => a
• In the equations for closed type family ‘Showable’
In the type family declaration for ‘Showable’
|
35 | Showable ((Show a => a): ts) = a: Showable ts
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
What can be done? I am fine with a solution that uses Template Haskell, or lowly hackery of any sort.
I don't believe that it is possible to do this easily (without TH) because of the open-world assumption: GHC basically will never resolve the negative of a class constraint, because there might be more instances somewhere that make it true (and doing so would not play nicely with the separate compilation strategy that GHC/Haskell uses). So it is not generally possible, from pure "regular" Haskell code, to decide whether or not a type has a class instance, and so whether or not to include it in the list.
If you are willing to slightly break separate compilation, by only considering instances that are in scope when the module that you are working on is compiled (i.e. that are in scope in that module's source file), you can use Template Haskell or GHC typechecker plugins to get something very much like this behavior. I know of a couple of implementations doing something similar at the value level, including ifcxt and constraints-emerge. I believe that these libraries, especially ifcxt (which I am slightly more familiar with) are quite simple: you can use the TH function reify to get a ClassI Info for a particular typeclass, and use its [InstanceDec] field to get a list of all instances that are in scope during compilation. Then you can basically make one branch for each concrete type instance that adds the instance head to the list, and follow it up with one catch-all branch that will not. You may also need to do this recursively to deal with instances that have constraints themselves.
Notice that if you choose to use this approach, this will break the open-world assumption in potentially confusing ways: if a module imports the type-level filter module, and then defines a datatype/instance, the type-level filter will not be aware of the new instance, and will continue to treat the type as if it does not have an instance. You will need to make sure that all instances that you care about are in scope when you use TH to generate the filter type family.
If you want to improve this somewhat, you can use an approach even more like IfCxt where instead of creating the type family instances directly, you might be able to do something like this:
class IsShow (a :: Type) (b :: Bool) where
instance {-# OVERLAPPABLE #-} (b ~ 'False) => IsShow a b where
And you have your TH generate instances like:
instance IsShow Int 'True where
This has the advantage that if another module defines important types/instances, you should be able to use (roughly) the same TH to extend the instances of IsShow with these new instances, and your type families that use IsShow should be fine. The ifcxt package linked above does basically the same thing, but instead of doing the necessary trickery to get the information at the type level, it just generates functions to get it at the value level.
This solution uses a class with functional dependencies instead of a type family because OverlappingInstances makes it possible to give the class-based solution a "default case". I'm not sure whether there's any reasonable way to give an open type family a default case, so you might not be able to get this "extensibility" while using type families everywhere (instead of fundep'd instances).
Richard Eisenberg says
With separate compilation, the lack of ordering and the overlap check are necessary for type soundness.
So I think it may be impossible. There are also some interesting discussions around type families vs. fundeps here: https://typesandkinds.wordpress.com/2015/09/09/what-are-type-families/
This paper establishes that type inference (called "typability" in the paper) in System F is undecidable. What I've never heard mentioned elsewhere is the second result of the paper, namely that "type checking" in F is also undecidable. Here the "type checking" question means: given a term t, type T and typing environment A, is the judgment A ⊢ t : T derivable? That this question is undecidable (and that it's equivalent to the question of typability) is surprising to me, because it seems intuitively like it should be an easier question to answer.
But in any case, given that Haskell is based on System F (or F-omega, even), the result about type checking would seem to suggest that there is a Haskell term t and type T such that the compiler would be unable to decide whether t :: T is valid. If that's the case, I'm curious what such a term and type are... if it's not the case, what am I misunderstanding?
Presumably comprehending the paper would lead to a constructive answer, but I'm a little out of my depth :)
Type checking can be made decidable by enriching the syntax appropriately. For example, in the paper, we have lambdas written as \x -> e; to type-check this, you must guess the type of x. However, with a suitably enriched syntax, this can be written as \x :: t -> e instead, which takes the guess-work out of the process. Similarly, in the paper, they allow type-level lambdas to be implicit; that is, if e :: t, then also e :: forall a. t. To do typechecking, you have to guess when and how many foralls to add, and when to eliminate them. As before, you can make this more deterministic by adding syntax: we add two new expression forms /\a. e and e [t] and two new typing rules that say if e :: t, then /\a. e :: forall a. t, and if e :: forall a. t, then e [t'] :: t [t' / a] (where the brackets in t [t' / a] are substitution brackets). Then the syntax tells us when and how many foralls to add, and when to eliminate them as well.
So the question is: can we go from Haskell to sufficiently-annotated System F terms? And the answer is yes, thanks to a few critical restrictions placed by the Haskell type system. The most critical is that all types are rank one*. Without going into too much detail, "rank" is related to how many times you have to go to the left of an -> constructor to find a forall.
Int -> Bool -- rank 0?
forall a. (a -> a) -- rank 1
(forall a. a -> a) -> (forall a. a -> a) -- rank 2
In particular, this restricts polymorphism a bit. We can't type something like this with rank one types:
foo :: (forall a. a -> a) -> (String, Bool) -- a rank-2 type
foo polymorphicId = (polymorphicId "hey", polymorphicId True)
The next most critical restriction is that type variables can only be replaced by monomorphic types. (This includes other type variables, like a, but not polymorphic types like forall a. a.) This ensures in part that type substitution preserves rank-one-ness.
It turns out that if you make these two restrictions, then not only is type-inference decidable, but you also get minimal types.
If we turn from Haskell to GHC, then we can talk not only about what is typable, but how the inference algorithm looks. In particular, in GHC, there are extensions that relax the above two restrictions; how does GHC do inference in that setting? Well, the answer is that it simply doesn't even try. If you want to write terms using those features, then you must add the typing annotations we talked about all the way back in paragraph one: you must explicitly annotate where foralls get introduced and eliminated. So, can we write a term that GHC's type-checker rejects? Yes, it's easy: simply use un-annotated rank-two (or higher) types or impredicativity. For example, the following doesn't type-check, even though it has an explicit type annotation and is typable with rank-two types:
{-# LANGUAGE Rank2Types #-}
foo :: (String, Bool)
foo = (\f -> (f "hey", f True)) id
* Actually, restricting to rank two is enough to make it decidable, but the algorithm for rank one types can be more efficient. Rank three types already give the programmer enough rope to make the inference problem undecidable. I'm not sure whether these facts were known at the time that the committee chose to restrict Haskell to rank-one types.
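For contrast with the rejected foo, here is a sketch of two ways to supply the annotation GHC is asking for. The names applyBoth, fooNamed and fooLambda are hypothetical; the second variant assumes ScopedTypeVariables is enabled so that the lambda binder can carry a type signature.

```haskell
{-# LANGUAGE RankNTypes, ScopedTypeVariables #-}

-- Variant 1: name the function and give it the rank-2 signature.
applyBoth :: (forall a. a -> a) -> (String, Bool)
applyBoth f = (f "hey", f True)

fooNamed :: (String, Bool)
fooNamed = applyBoth id

-- Variant 2: keep the lambda, but annotate its binder so GHC knows
-- that f must stay polymorphic inside the body.
fooLambda :: (String, Bool)
fooLambda = (\(f :: forall a. a -> a) -> (f "hey", f True)) id
```

In both variants the forall is written at the point where the polymorphic argument is bound, which is exactly the information the un-annotated lambda leaves for the type checker to guess.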
Here is an example for a type level implementation of the SKI calculus in Scala: http://michid.wordpress.com/2010/01/29/scala-type-level-encoding-of-the-ski-calculus/
The last example shows an unbounded iteration. If you do the same in Haskell (and I'm pretty sure you can), you have an example for an "untypeable expression".
Since type variables cannot hold poly-types, it seems that with Rank*Types we cannot re-use existing functions because of their monotype restriction.
For example, we cannot use the function (.) when the intermediate type is a polytype. We are forced to re-implement (.) at the spot. This is of course trivial for (.) but a problem for more substantial bodies of code.
I also think that making ((f . g) x) not equivalent to (f (g x)) is a severe blow to referential transparency and its benefits.
It seems to me to be a show-stopper issue, and seems to make the Rank*Types extensions almost impractical for wide-spread use.
Am I missing something? Is there a plan to make Rank*Types interact better with the rest of the type-system?
EDIT: How can you make the types of (runST . forever) work out?
The most recent proposal for Rank-N types is Don's linked FPH paper. In my opinion it's also the nicest of the bunch. The main goal of all these systems is to require as few type annotations as possible. The problem is that when going from Hindley/Milner to System F we lose principal types and type inference becomes undecidable – hence the need for type annotations.
The basic idea of the "boxy types" work is to propagate type annotations as far as possible. The type checker switches between type checking and type inference mode and hopefully no more annotations are required. The downside here is that whether or not a type annotation is required is hard to explain because it depends on implementation details.
Remy's MLF system is so far the nicest proposal; it requires the least amount of type annotations and is stable under many code transformations. The problem is that it extends the type system. The following standard example illustrates this:
choose :: forall a. a -> a -> a
id :: forall b. b -> b
choose id :: forall c. (c -> c) -> (c -> c)
choose id :: (forall c. c -> c) -> (forall c. c -> c)
Both of the above types are admissible in System F. The first one is the standard Hindley/Milner type and uses predicative instantiation; the second one uses impredicative instantiation. Neither type is more general than the other, so type inference would have to guess which type the user wants, and that is usually a bad idea.
MLF instead extends System F with bounded quantification. The principal (= most general) type for the above example would be:
choose id :: forall (a < forall b. b -> b). a -> a
You can read this as "choose id has type a to a where a must be an instance of forall b. b -> b".
Interestingly, this alone is no more powerful than standard Hindley/Milner. MLF therefore also allows rigid quantification. The following two types are equivalent:
(forall b. b -> b) -> (forall b. b -> b)
forall (a = forall b. b -> b). a -> a
Rigid quantification is introduced by type annotations and the technical details are indeed quite complicated. The upside is that MLF only needs very few type annotations and there is a simple rule for when they are needed. The downsides are:
Types can become harder to read, because the right hand side of '<' can contain further nested quantifications.
Until recently no explicitly typed variant of MLF existed. This is important for typed compiler transformations (like GHC does). Part 3 of Boris Yakobowski's PhD thesis has a first attempt at such a variant. (Parts 1 & 2 are also interesting; they describe a more intuitive representation of MLF via "Graphical Types".)
Coming back to FPH, its basic idea is to use MLF techniques internally, but to require type annotations on let bindings. If you only want the Hindley/Milner type, then no annotations are necessary. If you want a higher-rank type, you need to specify the requested type, but only at the let (or top-level) binding.
FPH (like MLF) supports impredicative instantiation, so I don't think your issue applies. It should therefore have no issue typing your f . g expression above. However, FPH hasn't been implemented in GHC yet and most likely won't be. The difficulties come from the interaction with equality coercions (and possibly type class constraints). I'm not sure what the latest status is, but I heard that SPJ wants to move away from impredicativity. All that expressive power comes at a cost, and so far no affordable and all-encompassing solution has been found.
Is there a plan to make Rank*Types interact better with the rest of the type-system?
Given how common the ST monad is, at least Rank2 types are common enough to be evidence to the contrary. However, you might look at the "sexy/boxy types" series of papers, for how approaches to making arbitrary rank polymorphism play better with others.
FPH : First-class Polymorphism for Haskell, Dimitrios Vytiniotis, Stephanie Weirich, and Simon Peyton Jones, submitted to ICFP 2008.
See also -XImpredicativeTypes -- which interestingly, is slated for deprecation!
About ImpredicativeTypes: that doesn't actually make a difference (I'm relatively sure) to peaker's question. That extension has to do with datatypes. For instance, GHC will tell you that:
Maybe :: * -> *
(forall a. a -> a) :: *
However, this is sort of a lie. It's true in an impredicative system, and in such a system, you can write:
Maybe (forall a. a -> a) :: *
and it will work fine. That is what ImpredicativeTypes enables. Without the extension, the appropriate way to think about this is:
Maybe :: *m -> *m
(forall a :: *m. a -> a) :: *p
and thus there is a kind mismatch when you try to form the application above.
GHC is fairly inconsistent on the impredicativity front, though. For instance, the type for id I gave above would be:
id :: (forall a :: *m. a -> a)
but GHC will gladly accept the annotation (with RankNTypes enabled, but not ImpredicativeTypes):
id :: (forall a. a -> a) -> (forall a. a -> a)
even though forall a. a -> a is not a monotype. So, it will allow impredicative instantiation of quantified variables that are used only with (->) if you annotate as such. But it won't do it itself, I guess, which leads to the runST $ ... problems. That used to be solved with an ad-hoc instantiation rule (the details of which I was never particularly clear on), but that rule was removed not long after it was added.