What is wrong with pattern matching of existential types in Haskell? [duplicate] - haskell

When I try to pattern-match a GADT in an proc syntax (with Netwire and Vinyl):
sceneRoot = proc inputs -> do
let (Identity camera :& Identity children) = inputs
returnA -< (<*>) (map (rGet draw) children) . pure
I get the (rather odd) compiler error, from ghc-7.6.3
My brain just exploded
I can't handle pattern bindings for existential or GADT data constructors.
Instead, use a case-expression, or do-notation, to unpack the constructor.
In the pattern: Identity cam :& Identity childs
I get a similar error when I put the pattern in the proc (...) pattern. Why is this? Is it unsound, or just unimplemented?

Consider the GADT
data S a where
S :: Show a => S a
and the execution of the code
foo :: S a -> a -> String
foo s x = case s of
S -> show x
In a dictionary-based Haskell implementation, one would expect that the value s is carrying a class dictionary, and that the case extracts the show function from said dictionary so that show x can be performed.
If we execute
foo undefined (\x::Int -> 4::Int)
we get an exception. Operationally, this is expected, because we can not access the dictionary.
More in general, case (undefined :: T) of K -> ... is going to produce an error because it forces the evaluation of undefined (provided that T is not a newtype).
Consider now the code (let's pretend that this compiles)
bar :: S a -> a -> String
bar s x = let S = s in show x
and the execution of
bar undefined (\x::Int -> 4::Int)
What should this do? One might argue that it should generate the same exception as with foo. If this were the case, referential transparency would imply that
let S = undefined :: S (Int->Int) in show (\x::Int -> 4::Int)
fails as well with the same exception. This would mean that the let is evaluating the undefined expression, very unlike e.g.
let [] = undefined :: [Int] in 5
which evaluates to 5.
Indeed, the patterns in a let are lazy: they do not force the evaluation of the expression, unlike case. This is why e.g.
let (x,y) = undefined :: (Int,Char) in 5
successfully evaluates to 5.
One might want to make let S = e in e' evaluate e if a show is needed in e', but it feels rather weird. Also, when evaluating let S = e1 ; S = e2 in show ... it would be unclear whether to evaluate e1, e2, or both.
GHC at the moment chooses to forbid all these cases with a simple rule: no lazy patterns when eliminating a GADT.

Related

Why are values of type () inspected?

In Haskell, the () type has two values, namely, () and bottom. If you have an expression e :: (), there's no point in actually inspecting it, since either it's e = () or by inspecting it you're crashing a program which could otherwise have not crashed.
Hence, I figured that perhaps operations on values of type () would not inspect the value and would not distinguish between () and bottom.
However, this is wildly untrue:
▎λ ghci
GHCi, version 9.0.2: https://www.haskell.org/ghc/ :? for help
ghci> u = (undefined :: ())
ghci> show u
"*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:75:14 in base:GHC.Err
undefined, called at <interactive>:1:6 in interactive:Ghci1
ghci> () == u
*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:75:14 in base:GHC.Err
undefined, called at <interactive>:1:6 in interactive:Ghci1
ghci> f () = "ok"
ghci> f u
"*** Exception: Prelude.undefined
CallStack (from HasCallStack):
error, called at libraries/base/GHC/Err.hs:75:14 in base:GHC.Err
undefined, called at <interactive>:1:6 in interactive:Ghci1
What is the reason for this? Here are some conjectures:
For some reason that I can't think of, it's useful to be non-lazy on (). Sometimes we want that bottom to propagate.
Haskell semantics are written in such a way that destructuring any ADTs, even trivial ones, inspects them. This means that having case (undefined :: ()) of { () -> ... } not throw would be a violation of language semantics
() is an extremely special case and simply isn't worth the attention to eke out this tiny extra bit of safety in a massive language like Haskell
There's also the possible combination explanation of 2+3, that Haskell could have had semantics dictating that an expression case e of inspects e unless it is of type (), but that would pollute the language spec for relatively low benefit
I will address this part:
For some reason that I can't think of, it's useful to be non-lazy on (). Sometimes we want that bottom to propagate.
Let's have a look at Control.Parallel.Strategies (version 1, an older version). This is a module for parallel evaluation. Let's focus on one of its functions for the sake of simplicity:
parMap :: Strategy b -> (a -> b) -> [a] -> [b]
The result of parMap strat f xs is the same as map f xs, except that the list is computed in parallel. What is the strat argument? Well,
strat :: Strategy b
means
strat :: b -> ()
There are only two things you can do with strat:
call it and ignore the result, which by laziness amounts to not calling it at all;
call it and force the result, even if you know it's () or a bottom.
parMap does the latter, in parallel. This allows the caller to specify a strat argument that evaluates the list values of type b as needed. For example
parMap (\(x,y) -> ()) f xs
parMap (\(x,y) -> x `seq` ()) f xs
parMap (\(x,y) -> x `seq` y `seq` ()) f xs
are valid calls, and will cause parMap to evaluate the new list-of-pairs only to expose the pair constructor, also the first component, also the second component, respectively.
Hence, forcing the () result of strat in this case allows the user to control how much evaluation to perform during parMap, i.e. how much to force the result (in parallel), and consequently which parts of the result should be left unevaluated. (By comparison map f xs would leave the result fully unevaluated -- it is completely lazy. parMap can not do that otherwise it is not longer parallel.)
Minor digression: note that the GADT
data a :~: b where
Refl :: t :~: t
has one constructor like (). Here, it is mandatory that such values are forced as in:
foo :: Int :~: String -> Int -> String
foo Refl x = x ++ " hello"
Here the first argument must be a bottom. By forcing that, we make the function error out with an exception. If we did not force that, we would get a very nasty undefined behavior like those in C and C++, completely breaking type safety. Haskell will correctly reject any attempt to circumvent that:
foo :: Int :~: String -> Int -> String
foo _ x = x ++ " hello"
triggers a type error at compile time.
I don't know for sure, but I suspect it's none of the things you said. Instead, this is so that the language is predictable and consistent.
There are, essentially, two things you observed, and I consider them to be separate things. The first is that checking whether a x is indeed () with a case statement forces evaluation of x; the second is that the instances (of Show and Eq) are written to use a case statement.
Pattern matching: the predictable, consistent rule here is that if you write case <e0> of <pat> -> <e1>, then e0 is evaluated far enough to check whether the constructors in pat are in fact in the given places. Well, okay, there's some wrinkles here to do with irrefutable patterns; let's say instead that e0 is evaluated far enough to check whether pat actually does match! For the () type, that means that the pattern () causes full evaluation -- because you've specified the full value that you expect it to be -- while the pattern x or _ can match without further evaluation.
Class instances: the natural inductive way to specify what the various class instances do is to always have an outermost case that matches against each available constructor with simple variable patterns for the fields, then does something (presumably recursive calls) on each of the fields in turn. That is, simplifying a bit, the show implementation goes like:
show x = case x of
<Con0> field00 field01 field02 <...> -> "<Con0>"
++ " " ++ show field00
++ " " ++ show field01
++ " " ++ show field02
++ <...>
<Con1> field10 field11 field12 <...> -> "<Con1>"
++ " " ++ show field10
++ " " ++ show field11
++ " " ++ show field12
++ <...>
<...>
It is very natural for the specialization of this scheme to the single-constructor, zero-field type () to go:
show x = case x of
() -> "()"
(Additionally, the Report specifies that (==) is is always strict in both arguments; but that property would also arise naturally from the obvious way of writing a generic Eq instance derivation algorithm.) Therefore the path of least surprise is for class instances to pattern match on their argument(s).
#2 is definitely true.
The () type is just a nullary data type with special type/data constructor syntax:
data () = ()
As a result, the Haskell 2010 report, while only providing an informal semantics, makes it pretty clear in section 3.17.2 Informal Semantics of Pattern Matching that the expression:
case undefined of () -> "ack!"
will be evaluated as per rule #5:
Matching the pattern con pat1 … patn against a value, where con is a constructor defined by data, depends on the value:
If the value is of the form con v1 … vn, sub-patterns are matched left-to-right against the components of the data value; if all matches succeed, the overall match succeeds; the first to fail or diverge causes the overall match to fail or diverge, respectively.
If the value is of the form con′ v1 … vm, where con is a different constructor to con′, the match fails.
If the value is ⊥, the match diverges.
Here, the value of undefined is ⊥, so the third bullet point applies, and the match diverges. And if the match diverges, the program diverges, and if the program diverges it must terminate with an error or -- at worst -- loop forever. It cannot continue as if nothing has happened. Admittedly, this last part is not explicitly stated, but it is the only reasonable interpretation for the semantics of a divergent evaluation of an expression.

Type inference in where clauses

If I write
foo :: [Int]
foo = iterate (\x -> _) 0
GHC happily tells me that x is an Int and that the hole should be another Int. However, if I rewrite it to
foo' :: [Int]
foo' = iterate next 0
where next x = _
it has no idea what the type of x, nor the hole, is. The same happens if I use let.
Is there any way to recover type inference in where bindings, other than manually adding type signatures?
Not really. This behavior is by design and is inherited from the theoretical Hindley-Milner type system that formed the initial inspiration for Haskell's type system. (The behavior is known as "let-polymoprhism" and is arguably the most critical feature of the H-M system.)
Roughly speaking, lambdas are typed "top-down": the expression (\x -> _) is first assigned the type Int -> Int when type-checking the containing expression (specifically, when type-checking iterate's arguments), and this type is then used to infer the type of x :: Int and of the hole _ :: Int.
In contrast, let and where-bound variables are typed "bottom-up". The type of next x = _ is inferred first, independently of its use in the main expression, and once that type has been determined, it's checked against its use in the expression iterate next 0. In this case, the expression next x = _ is inferred to have the rather useless type p -> t. Then, that type is checked against its use in the expression iterate next 0 which specializes it to Int -> Int (by taking p ~ Int and t ~ Int) and successfully type-checks.
In languages/type-systems without this distinction (and ignoring recursive bindings), a where clause is just syntactic sugar for a lambda binding and application:
foo = expr1 where baz = bazdefn ==> foo = (\baz -> expr1) bazdefn
so one thing you could do is "desugar" the where clause to the "equivalent" lambda binding:
foo' :: [Int]
foo' = (\next -> iterate next 0) (\x -> _)
This syntax is physically repulsive, sure, but it works. Because of the top-down typing of lambdas, both x and the hole are typed as Int.

How to define a function type of a non-recursive functions that has still a recursive type

Given is a Javascript function like
const isNull = x => x === null ? [] : [isNull];
Such a function might be nonsense, which is not the question, though.
When I tried to express a Haskell-like type annotation, I failed. Likewise with attempting to implement a similar function in Haskell:
let isZero = \n -> if n == 0 then [] else [isZero] -- doesn't compile
Is there a term for this kind of functions that aren't recursive themselves, but recursive in their type? Can such functions be expressed only in dynamically typed languages?
Sorry if this is obvious - my Haskell knowledge (including strict type systems) is rather superficial.
You need to define an explicit recursive type for that.
newtype T = T (Int -> [T])
isZero :: T
isZero = T (\n -> if n == 0 then [] else [isZero])
The price to pay is the wrapping/unwrapping of the T constructor, but it is feasible.
If you want to emulate a Javascript-like untyped world (AKA unityped, or dynamically typed), you can even use
data Value
= VInt Int
| VList [Value]
| VFun (Value -> Value)
...
(beware of a known bug)
In principle, every Javascript value can be represented by the above huge sum type. For example, application becomes something like
jsApply (VFun f) v = f v
jsApply _ _ = error "Can not apply a non-function value"
Note how static type checks are turned into dynamic checks, in this way. Similarly, static type errors are turned into runtime errors.
Chi showed how such an infinite type can be implemented: you need a newtype wrapper to “hide” the infinite recursion.
An intriguing alternative is to use a fixpoint formulation. Recall that you could pseudo-define something recursive like your example as
isZero = fix $ \f n -> if n == 0 then [] else [f]
Likewise, the type can actually be expressed as a fixpoint of the relevant functors, namely of the composition of (Int ->) and [] (which in transformer gestalt is ListT):
isZero :: Fix (ListT ((->) Int))
isZero = Fix . ListT $ \n -> if n==0 then [] else [isZero]
Also worth noting is that you probably don't really want ListT there. MaybeT would be more natural if you only ever have zero or one elements. Even more nicely though, you can use the fact that functor fixpoints are closely related to the free monad, which gives you exactly that “possibly trivial” alternative case:
isZero' :: Free ((->) Int) ()
isZero' = wrap $ \n -> if n==0 then Pure () else isZero'
Pure () is just return () in the monad instance, so you can as well replace the if construct with the standard when:
isZero' = wrap $ \n -> when (n/=0) isZero'

When are type signatures necessary in Haskell?

Many introductory texts will tell you that in Haskell type signatures are "almost always" optional. Can anybody quantify the "almost" part?
As far as I can tell, the only time you need an explicit signature is to disambiguate type classes. (The canonical example being read . show.) Are there other cases I haven't thought of, or is this it?
(I'm aware that if you go beyond Haskell 2010 there are plenty for exceptions. For example, GHC will never infer rank-N types. But rank-N types are a language extension, not part of the official standard [yet].)
Polymorphic recursion needs type annotations, in general.
f :: (a -> a) -> (a -> b) -> Int -> a -> b
f f1 g n x =
if n == (0 :: Int)
then g x
else f f1 (\z h -> g (h z)) (n-1) x f1
(Credit: Patrick Cousot)
Note how the recursive call looks badly typed (!): it calls itself with five arguments, despite f having only four! Then remember that b can be instantiated with c -> d, which causes an extra argument to appear.
The above contrived example computes
f f1 g n x = g (f1 (f1 (f1 ... (f1 x))))
where f1 is applied n times. Of course, there is a much simpler way to write an equivalent program.
Monomorphism restriction
If you have MonomorphismRestriction enabled, then sometimes you will need to add a type signature to get the most general type:
{-# LANGUAGE MonomorphismRestriction #-}
-- myPrint :: Show a => a -> IO ()
myPrint = print
main = do
myPrint ()
myPrint "hello"
This will fail because myPrint is monomorphic. You would need to uncomment the type signature to make it work, or disable MonomorphismRestriction.
Phantom constraints
When you put a polymorphic value with a constraint into a tuple, the tuple itself becomes polymorphic and has the same constraint:
myValue :: Read a => a
myValue = read "0"
myTuple :: Read a => (a, String)
myTuple = (myValue, "hello")
We know that the constraint affects the first part of the tuple but does not affect the second part. The type system doesn't know that, unfortunately, and will complain if you try to do this:
myString = snd myTuple
Even though intuitively one would expect myString to be just a String, the type checker needs to specialize the type variable a and figure out whether the constraint is actually satisfied. In order to make this expression work, one would need to annotate the type of either snd or myTuple:
myString = snd (myTuple :: ((), String))
In Haskell, as I'm sure you know, types are inferred. In other words, the compiler works out what type you want.
However, in Haskell, there are also polymorphic typeclasses, with functions that act in different ways depending on the return type. Here's an example of the Monad class, though I haven't defined everything:
class Monad m where
return :: a -> m a
(>>=) :: m a -> (a -> m b) -> m b
fail :: String -> m a
We're given a lot of functions with just type signatures. Our job is to make instance declarations for different types that can be treated as Monads, like Maybe t or [t].
Have a look at this code - it won't work in the way we might expect:
return 7
That's a function from the Monad class, but because there's more than one Monad, we have to specify what return value/type we want, or it automatically becomes an IO Monad. So:
return 7 :: Maybe Int
-- Will return...
Just 7
return 6 :: [Int]
-- Will return...
[6]
This is because [t] and Maybe have both been defined in the Monad type class.
Here's another example, this time from the random typeclass. This code throws an error:
random (mkStdGen 100)
Because random returns something in the Random class, we'll have to define what type we want to return, with a StdGen object tupelo with whatever value we want:
random (mkStdGen 100) :: (Int, StdGen)
-- Returns...
(-3650871090684229393,693699796 2103410263)
random (mkStdGen 100) :: (Bool, StdGen)
-- Returns...
(True,4041414 40692)
This can all be found at learn you a Haskell online, though you'll have to do some long reading. This, I'm pretty much 100% certain, it the only time when types are necessary.

How to work around F#'s type system

In Haskell, you can use unsafeCoerce to override the type system. How to do the same in F#?
For example, to implement the Y-combinator.
I'd like to offer a different solution, based on embedding the untyped lambda calculus in a typed functional language. The idea is to create a data type that allows us to change between types α and α → α, which subsequently allows to escape the restrictions of a type system. I'm not very familiar with F# so I'll give my answer in Haskell, but I believe it could be adapted easily (perhaps the only complication could be F#'s strictness).
-- | Roughly represents morphism between #a# and #a -> a#.
-- Therefore we can embed a arbitrary closed λ-term into #Any a#. Any time we
-- need to create a λ-abstraction, we just nest into one #Any# constructor.
--
-- The type parameter allows us to embed ordinary values into the type and
-- retrieve results of computations.
data Any a = Any (Any a -> a)
Note that the type parameter isn't significant for combining terms. It just allows us to embed values into our representation and extract them later. All terms of a particular type Any a can be combined freely without restrictions.
-- | Embed a value into a λ-term. If viewed as a function, it ignores its
-- input and produces the value.
embed :: a -> Any a
embed = Any . const
-- | Extract a value from a λ-term, assuming it's a valid value (otherwise it'd
-- loop forever).
extract :: Any a -> a
extract x#(Any x') = x' x
With this data type we can use it to represent arbitrary untyped lambda terms. If we want to interpret a value of Any a as a function, we just unwrap its constructor.
First let's define function application:
-- | Applies a term to another term.
($$) :: Any a -> Any a -> Any a
(Any x) $$ y = embed $ x y
And λ abstraction:
-- | Represents a lambda abstraction
l :: (Any a -> Any a) -> Any a
l x = Any $ extract . x
Now we have everything we need for creating complex λ terms. Our definitions mimic the classical λ-term syntax, all we do is using l to construct λ abstractions.
Let's define the Y combinator:
-- λf.(λx.f(xx))(λx.f(xx))
y :: Any a
y = l (\f -> let t = l (\x -> f $$ (x $$ x))
in t $$ t)
And we can use it to implement Haskell's classical fix. First we'll need to be able to embed a function of a -> a into Any a:
embed2 :: (a -> a) -> Any a
embed2 f = Any (f . extract)
Now it's straightforward to define
fix :: (a -> a) -> a
fix f = extract (y $$ embed2 f)
and subsequently a recursively defined function:
fact :: Int -> Int
fact = fix f
where
f _ 0 = 1
f r n = n * r (n - 1)
Note that in the above text there is no recursive function. The only recursion is in the Any data type, which allows us to define y (which is also defined non-recursively).
In Haskell, unsafeCoerce has the type a -> b and is generally used to assert to the compiler that the thing being coerced actually has the destination type and it's just that the type-checker doesn't know it.
Another, less common use, is to reinterpret a pattern of bits as another type. For example an unboxed Double# could be reinterpreted as an unboxed Int64#. You have to be sure about the underlying representations for this to be safe.
In F#, the first application can be achieved with box |> unbox as John Palmer said in a comment on the question. If possible use explicit type arguments to make sure that you don't accidentally have the wrong coercion inferred, e.g. box<'a> |> unbox<'b> where 'a and 'b are type variables or concrete types that are already in scope in your code.
For the second application, look at the BitConverter class for specific conversions of bit-patterns. In theory you could also do something like interfacing with unmanaged code to achieve this, but that seems very heavyweight.
These techniques won't work for implementing the Y combinator because the cast is only valid if the runtime objects actually do have the target type, but with the Y combinator you actually need to call the same function again but with a different type. For this you need the kinds of encoding tricks mentioned in the question John Palmer linked to.

Resources