Code that exercises type inference - programming-languages

I'm working on an experimental programming language that has global polymorphic type inference.
I recently got the algorithm working sufficiently well to correctly type the bits of sample code I'm throwing at it. I'm now looking for something more complex that will exercise the edge cases.
Can anyone point me at a source of really gnarly and horrible code fragments that I can use for this? I'm sure the functional programming world has plenty. I'm particularly looking for examples that do evil things with function recursion, as I need to check to make sure that function expansion terminates correctly, but anything's good --- I need to build a test suite. Any suggestions?
My language is largely imperative, but any ML-style code ought to be easy to convert.

My general strategy is actually to approach it from the opposite direction -- ensure that it rejects incorrect things!
That said, here are some standard "confirmation" tests I usually use:
The eager fix point combinator (unashamedly stolen from here):
datatype 'a t = T of 'a t -> 'a
val y = fn f => (fn (T x) => (f (fn a => x (T x) a)))
                (T (fn (T x) => (f (fn a => x (T x) a))))
Obvious mutual recursion:
fun f x = g (f x)
and g x = f (g x)
Check out those deeply nested let expressions too:
val a = let
  val b = let
    val c = let
      val d = let
        val e = let
          val f = let
            val g = let
              val h = fn x => x + 1
            in h end
          in g end
        in f end
      in e end
    in d end
  in c end
in b end
Deeply nested higher order functions!
fun f g h i j k l m n =
fn x => fn y => fn z => x o g o h o i o j o k o l o m o n o x o y o z
I don't know if you have to have the value restriction in order to incorporate mutable references. If so, see what happens:
fun map' f [] = []
  | map' f (h::t) = f h :: map' f t
fun rev' [] = []
  | rev' (h::t) = rev' t @ [h]
val x = map' rev'
You might need to implement map and rev in the standard way :)
Then with actual references lying around (stolen from here):
val stack =
  let val stk = ref [] in
    {push = fn x => stk := x :: !stk,
     pop = fn () => stk := tl (!stk),
     top = fn () => hd (!stk)}
  end
Hope these help in some way. Make sure to try to build a set of regression tests you can re-run in some automatic fashion to ensure that all of your type inference behaves correctly through all changes you make :)
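As a rough illustration of such a re-runnable suite, here is a minimal Haskell sketch. Everything in it is hypothetical: `Infer` stands in for whatever entry point your type checker exposes, and `toyInfer` is a stub that exists only to make the example run. Each case records either the type you expect or the fact that the program must be rejected.

```haskell
-- Hypothetical interface: Left = type error, Right = printed inferred type.
type Infer = String -> Either String String

data Expected = HasType String | Rejected

data TestCase = TestCase
  { caseName :: String
  , source   :: String
  , expected :: Expected
  }

-- A case passes when the checker's verdict matches the expectation.
passes :: Infer -> TestCase -> Bool
passes infer tc = case (infer (source tc), expected tc) of
  (Right ty, HasType want) -> ty == want
  (Left _,   Rejected)     -> True
  _                        -> False

-- Names of the failing cases; an empty list means the suite is green.
failures :: Infer -> [TestCase] -> [String]
failures infer cases = [ caseName tc | tc <- cases, not (passes infer tc) ]

-- Toy stand-in for a real checker, used only to demonstrate the harness.
toyInfer :: Infer
toyInfer "fn x => x"   = Right "'a -> 'a"
toyInfer "fn x => x x" = Left "occurs check"
toyInfer _             = Left "unknown program"

suite :: [TestCase]
suite =
  [ TestCase "identity"         "fn x => x"   (HasType "'a -> 'a")
  , TestCase "self-application" "fn x => x x" Rejected
  ]
```

Running `failures toyInfer suite` after every change gives a quick list of regressions, and the `Rejected` cases cover the "reject incorrect things" half of the strategy.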

Related

Is this an accurate example of a Haskell Pullback?

I'm still trying to grasp an intuition of pullbacks (from category theory), limits, and universal properties, and I'm not quite catching their usefulness, so maybe you could help shed some insight on that as well as verifying my trivial example?
The following is intentionally verbose. The pullback should be (p, p1, p2), and (q, q1, q2) is one example of a non-universal object to "test" the pullback against, to see whether things commute properly.
-- MY DIAGRAM, A -> B <- C
type A = Int
type C = Bool
type B = (A, C)
f :: A -> B
f x = (x, True)
g :: C -> B
g x = (1, x)
-- PULLBACK, (p, p1, p2)
type PL = Int
type PR = Bool
type P = (PL, PR)
p = (1, True) :: P
p1 = fst
p2 = snd
-- (g . p2) p == (f . p1) p
-- TEST CASE
type QL = Int
type QR = Bool
type Q = (QL, QR)
q = (152, False) :: Q
q1 :: Q -> A
q1 = ((+) 1) . fst
q2 :: Q -> C
q2 = ((||) True) . snd
u :: Q -> P
u (_, _) = (1, True)
-- (p2 . u == q2) && (p1 . u == q1)
I was just trying to come up with an example that fit the definition, but it doesn't seem particularly useful. When would I "look for" a pull back, or use one?
I'm not sure Haskell functions are the best context in which to talk about pull-backs. The pull-back of A -> B and C -> B can be identified with a subset of A x C, and subset relationships are not directly expressible in Haskell's type system. In your specific example the pull-back would be the single element (1, True), because x = 1 and b = True are the only values for which f(x) = g(b).
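To make the "subset of A x C" reading concrete, here is a small sketch that enumerates the pull-back at the value level. The function name `pullback` and the restriction to finite candidate lists are assumptions made for illustration, since Haskell cannot express the subset as a type.

```haskell
-- Pull-back of f and g, restricted to finite lists of candidate values
-- (an assumption made so the result can be enumerated).
pullback :: Eq b => (a -> b) -> (c -> b) -> [a] -> [c] -> [(a, c)]
pullback f g as cs = [ (a, c) | a <- as, c <- cs, f a == g c ]

-- The two arrows from the question.
f :: Int -> (Int, Bool)
f x = (x, True)

g :: Bool -> (Int, Bool)
g b = (1, b)

-- The projections out of the pull-back are just fst and snd.
example :: [(Int, Bool)]
example = pullback f g [0 .. 3] [False, True]
-- example == [(1,True)]
```

Only the pair (1, True) survives the filter, which matches the single-element pull-back described above.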
Some good "practical" examples of pull-backs may be found starting on page 41 of Category Theory for Scientists by David I. Spivak.
Relational joins are the archetypal example of pull-backs which occur in computer science. The query:
SELECT ...
FROM A, B
WHERE A.x = B.y
selects pairs of rows (a,b) where a is a row from table A and b is a row from table B and where some function of a equals some other function of b. In this case the functions being pulled back are f(a) = a.x and g(b) = b.y.
Another interesting example of a pullback is type unification in type inference. You get type constraints from several places where a variable is used, and you want to find the tightest unifying constraint. I mention this example in my blog.

Is it possible to generalise equations in Haskell?

Apologies for my poor wording of the question. I've tried searching for an answer but not knowing what to search is making it very difficult to find one.
Here is a simple function which calculates the area of a triangle.
triangleArea :: Float -> Float -> Float -> Float
triangleArea a b c
    | (a + b) <= c = error "Not a triangle!"
    | (a + c) <= b = error "Not a triangle!"
    | (b + c) <= a = error "Not a triangle!"
    | otherwise = sqrt (s * (s - a) * (s - b) * (s - c))
    where s = (a + b + c) / 2
Three lines of the function have been taken up for the purposes of error checking. I was wondering if these three lines could be condensed into one generic line.
I was wondering if something similar to the following would be possible
(arg1 + arg2) == arg3
where Haskell knows to check each possible combination of the three arguments.
I think @behzad.nouri's comment is the best. Sometimes doing a little math is the best way to program. Here's a somewhat overdone expansion on @melpomene's solution, which I thought would be fun to share. Let's write a function similar to permutations but that computes combinations:
import Control.Arrow (first, second)
-- choose n xs returns a list of tuples, the first component of each having
-- n elements and the second component having the rest, in all combinations
-- (ignoring order within the lists). N.B. this would be faster if implemented
-- using a DList.
choose :: Int -> [a] -> [([a],[a])]
choose 0 xs = [([], xs)]
choose _ [] = []
choose n (x:xs) =
    map (first (x:)) (choose (n-1) xs) ++
    map (second (x:)) (choose n xs)
So..
ghci> choose 2 [1,2,3]
[([1,2],[3]),([1,3],[2]),([2,3],[1])]
Now you can write
triangleArea a b c
    | or [ x + y <= z | ([x,y], [z]) <- choose 2 [a,b,c] ] = error ...
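For comparison, the "little math" route mentioned at the start of this answer can be sketched as follows. This is one possible arithmetic shortcut and not necessarily what the comment proposed: the three checks fail exactly when the longest side is at least the sum of the other two, that is, when twice the maximum is at least the total. The helper name `isTriangle` is hypothetical.

```haskell
-- Hypothetical helper: all three strict triangle inequalities hold
-- exactly when twice the longest side is less than the perimeter.
isTriangle :: (Num a, Ord a) => a -> a -> a -> Bool
isTriangle a b c = 2 * maximum [a, b, c] < a + b + c

triangleArea :: Float -> Float -> Float -> Float
triangleArea a b c
  | not (isTriangle a b c) = error "Not a triangle!"
  | otherwise = sqrt (s * (s - a) * (s - b) * (s - c))
  where s = (a + b + c) / 2
```

With floating-point inputs the rearranged comparison may round differently from the three original ones in borderline cases, so treat it as a sketch rather than a drop-in replacement.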
This doesn't address the question of how to shorten your error checking code, but you may be able to limit how often you repeat it by defining some new types with invariants. This function needs error checking because you can't trust the user to supply Float triples that make a reasonable triangle, and if you continue to define functions this way then every triangle-related function you write would need similar error checks.
However, if you define a Triangle type, you can check your invariants only once, when a triangle is created, and then all other functions will be guaranteed to receive valid triangles:
module Triangle (Triangle(), mkTriangle, area) where
data Triangle a = Triangle a a a deriving Show
mkTriangle :: (Num a, Ord a) => a -> a -> a -> Either String (Triangle a)
mkTriangle a b c
    | a + b <= c = wrong
    | a + c <= b = wrong
    | b + c <= a = wrong
    | otherwise = Right $ Triangle a b c
    where wrong = Left "Not a triangle!"
area :: Floating a => Triangle a -> a
area (Triangle a b c) = sqrt (s * (s - a) * (s - b) * (s - c))
    where s = (a + b + c) / 2
Here we export the Triangle type, but not its constructor, so that the client must use mkTriangle instead, which can do the required error checking. Then area, and any other triangle functions you write, can omit the checks that they are receiving a valid triangle. This general pattern is called "smart constructors".
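A short sketch of how client code might consume the smart constructor. The module header is left out so the example compiles as a single file, the three guards are merged into one for brevity, and `areaOf` is a hypothetical client function that is not part of the answer.

```haskell
data Triangle a = Triangle a a a deriving Show

-- Same invariant check as above, condensed into a single guard.
mkTriangle :: (Num a, Ord a) => a -> a -> a -> Either String (Triangle a)
mkTriangle a b c
  | a + b <= c || a + c <= b || b + c <= a = Left "Not a triangle!"
  | otherwise = Right (Triangle a b c)

area :: Floating a => Triangle a -> a
area (Triangle a b c) = sqrt (s * (s - a) * (s - b) * (s - c))
  where s = (a + b + c) / 2

-- Hypothetical client: validate once, then map area over the result.
areaOf :: Double -> Double -> Double -> Either String Double
areaOf a b c = area <$> mkTriangle a b c
```

Here `areaOf 3 4 5` yields `Right 6.0`, while `areaOf 1 2 3` yields `Left "Not a triangle!"` without `area` ever seeing invalid input.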
Here are two ideas.
1. Using existing tools, you can generate all the permutations of the arguments and check whether any of them violates the condition. Thus:
import Data.List
triangleArea a b c
    | any (\[x, y, z] -> x + y <= z) (permutations [a,b,c])
    = error "Not a triangle!"
    | otherwise = {- ... -}
This doesn't require writing very much additional code; however, it will search some permutations you don't care about.
2. Use the usual trick for choosing an element from a list and the left-overs. The zippers function is one I use frequently:
zippers :: [a] -> [([a], a, [a])]
zippers = go [] where
    go b [] = []
    go b (v:e) = (b, v, e) : go (v:b) e
We can use it to build a function which chooses only appropriate triples of elements:
triples :: [a] -> [(a, a, a)]
triples xs = do
    (b1, v1, e1) <- zippers xs
    (b2, v2, e2) <- zippers e1
    v3 <- b1 ++ b2 ++ e2
    return (v1, v2, v3)
Now we can write our guard like in part (1), but it will only consider unique pairings for the addition.
triangleArea a b c
    | any (\(x, y, z) -> x + y <= z) (triples [a,b,c])
    = error "Not a triangle!"
    | otherwise = {- ... -}

Is there a lazy functional (immutable) language where functions have intermediate variables+return?

I apologize if this has an obvious answer. I would like to find a lazy functional programming language where the following pseudo code makes sense:
let f = function(x) {
let y = x*x // The variables y and z
let z = y*2 // are local
return z
}
This is, of course, doable in a functional, declarative language like Haskell, either with let bindings or with the where keyword. The bindings are simply chained together in one expression.
-- local variables with 'where'
f :: Int -> Int
f x = z where
    z = y*2
    y = x*x
-- local variables with let
g :: Int -> Int
g x =
    let y = x*x
        z = y*2
    in z
https://wiki.haskell.org/Let_vs._Where
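Since the question asks specifically about laziness, a small sketch may help show that such local bindings are only evaluated when their value is demanded. The function `safeDiv` is a hypothetical example, not taken from the answer.

```haskell
-- Local bindings are lazy: q is only computed if its value is demanded.
safeDiv :: Int -> Int -> Int
safeDiv n d = if d == 0 then 0 else q
  where
    q = n `div` d  -- never evaluated when d == 0

-- Binding order does not matter; z may be listed before y.
f :: Int -> Int
f x = z
  where
    z = y * 2
    y = x * x
```

Calling `safeDiv 7 0` returns 0 without raising a divide-by-zero error, because the branch that needs `q` is never taken.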

Remove elements during infinite sequence generation

I found a great Haskell solution (source) for generating a Hofstadter sequence:
hofstadter = unfoldr (\(r:s:ss) -> Just (r, r+s:delete (r+s) ss)) [1..]
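For reference, a self-contained version of that one-liner might look like the sketch below. The imports, the type signature, the helper name `step`, and the fallback clause are additions for illustration; the fallback is never reached for an infinite seed and only keeps the pattern match total.

```haskell
import Data.List (delete, unfoldr)

hofstadter :: [Integer]
hofstadter = unfoldr step [1 ..]
  where
    -- Emit r, then continue with r + s in front and its later copy removed.
    step (r : s : ss) = Just (r, r + s : delete (r + s) ss)
    step _            = Nothing  -- unreachable here; keeps the match total

firstTen :: [Integer]
firstTen = take 10 hofstadter
-- firstTen == [1,3,7,12,18,26,35,45,56,69]
```

It works on an infinite list because `delete` is lazy: it only walks the list as far as the consumer forces it.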
Now I am trying to write the same solution in F#. Unfortunately, I am not really familiar with F# and have had no success so far.
My problem is that when I use a sequence in F#, there seems to be no way to remove an element, as the Haskell solution does.
Other data structures such as arrays, lists, or sets do allow elements to be removed, but they hold a fixed collection of elements rather than generating an infinite sequence.
So my question: is it possible in F# to generate an infinite sequence from which elements are deleted?
Some stuff I tried so far:
Infinite sequence of numbers:
let infinite =
    Seq.unfold( fun state -> Some( state, state + 1) ) 1
Hofstadter sequence (not working, because there is no del function and there are further syntax errors):
let hofstadter =
    Seq.unfold( fun (r :: s :: ss) -> Some( r, r+s, del (r+s) ss)) infinite
I thought about using Seq.filter, but found no solution with it either.
I think you need more than a delete function on sequences. Your example requires pattern matching on infinite collections, which sequences don't support.
The F# counterpart of a Haskell list is LazyList from the F# PowerPack. LazyList is also potentially infinite and supports pattern matching, which makes delete easy to implement.
Here is a faithful translation:
open Microsoft.FSharp.Collections.LazyList
let delete x xs =
    let rec loop x xs = seq {
        match xs with
        | Nil -> yield! xs
        | Cons(x', xs') when x = x' -> yield! xs'
        | Cons(x', xs') ->
            yield x'
            yield! loop x xs'
        }
    ofSeq (loop x xs)
let hofstadter =
    1I |> unfold (fun state -> Some(state, state + 1I))
       |> unfold (function | (Cons(r, Cons(s, ss))) ->
                                 Some(r, cons (r+s) (delete (r+s) ss))
                           | _ -> None)
       |> toSeq
There are a few interesting things here:
A sequence expression is used to implement delete so that the function is tail-recursive. A non-tail-recursive version should be easy to write.
BigInteger is used; if you don't need too many elements, int and Seq.initInfinite are more efficient.
A case returning None is added to make the pattern match exhaustive.
Finally, the LazyList is converted to a sequence, which gives better interoperability with .NET collections.
Implementing delete on sequence is uglier. If you are curious, take a look at Remove a single non-unique value from a sequence in F# for reference.
pad's solution is nice but, likely due to the way LazyList is implemented, it overflows the stack somewhere between 3K and 4K numbers. For curiosity's sake I wrote a version built around a generator function (unit -> 'a) that is called repeatedly to get the next element (to work around the unwieldiness of IEnumerable). I was able to get the first 10K numbers (I haven't tried beyond that).
let hofstadter() =
    let delete x f =
        let found = ref false
        let rec loop() =
            let y = f()
            if not !found && x = y
            then found := true; loop()
            else y
        loop
    let cons x f =
        let first = ref true
        fun () ->
            if !first
            then first := false; x
            else f()
    let next =
        let i = ref 0
        fun () -> incr i; !i
    Seq.unfold (fun next ->
        let r = next()
        let s = next()
        Some(r, (cons (r+s) (delete (r+s) next)))) next
In fact, you can use filter and a design that follows the Haskell solution (but, as @pad says, you don't have pattern matching on sequences, so I used Lisp-style destructuring):
let infinite = Seq.initInfinite (fun i -> i+1)
let generator = fun ss -> let (r, st) = (Seq.head ss, Seq.skip 1 ss)
                          let (s, stt) = (Seq.head st, Seq.skip 1 st)
                          let srps = seq [ r + s ]
                          let filtered = Seq.filter (fun t -> (r + s) <> t) stt
                          Some (r, Seq.append srps filtered)
let hofstadter = Seq.unfold generator infinite
let t10 = Seq.take 10 hofstadter |> Seq.toList
// val t10 : int list = [1; 3; 7; 12; 18; 26; 35; 45; 56; 69]
I make no claims about efficiency though!

Haskell - pattern matching syntactic sugar and where

Often I have a function of such pattern:
f :: a -> b
f x = case x of
    ... -> g ...
    ... -> g ...
    ...
    ... -> g ...
  where g = ...
There is syntactic sugar for almost exactly this case:
f :: a -> b
f ... = g ...
f ... = g ...
...
f ... = g ...
Unfortunately, I can't attach my where clause to it: I obviously get a bunch of "not in scope" errors.
I could make g a separate top-level function, but that's not nice: my module's namespace would be polluted with utility functions.
Is there any workaround?
I think that your first example isn't bad at all. The only syntactic weight is case x of, plus -> instead of =; the latter is offset by the fact that you can omit the function name for each clause. Indeed, even dflemstr's proposed go helper function is syntactically heavier.
Admittedly, it's slightly inconsistent compared to the normal function clause syntax, but this is probably a good thing: it more precisely visually delimits the scope in which x is available.
No, there is no workaround. When you have multiple clauses for a function like that, they cannot share a where clause. Your only option is to use a case expression, or do something like this:
f x =
    go x
  where
    go ... = g ...
    go ... = g ...
    g = ...
...if you really want to use a function form for some reason.
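A concrete instance of the go-helper pattern, as a minimal sketch. The names `describe` and `word` are hypothetical; the point is only that every clause of `go` can call `g`, and that `g` can still refer to the outer argument.

```haskell
describe :: Int -> String
describe n = go n
  where
    -- The clauses of go play the role of the original clauses of f.
    go 0 = g "zero"
    go 1 = g "one"
    go _ = g "many"
    -- g is shared by all clauses of go and can still see n.
    g word = show n ++ " is " ++ word
```

Neither `go` nor `g` leaks into the module's namespace, which addresses the pollution concern from the question.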
f = g . h -- h is most of your original f
  where h ... = ...
        h ... = ...
        g =
From Haskell 2010 on, or with GHC, you can also use pattern guards:
f x
  | m1 <- x = g
  | m2 <- x = g
  ...
  where g =
but note that you cannot use the variables bound in the patterns in g. It's equivalent to:
f x = let g = ... in case () of
    () -> case x of
        m1 -> g
        _ -> case x of
            m2 -> g
            ....
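A small runnable sketch of the pattern-guard form. The function `label` and its strings are hypothetical; it shows a `where` binding shared across guards, and that a variable bound by a guard pattern is visible only on that guard's right-hand side.

```haskell
label :: Maybe Int -> String
label x
  | Nothing <- x = g
  | Just 0  <- x = g
  | Just n  <- x = "value " ++ show n  -- n is in scope only on this line
  where
    -- g is visible in every guard but cannot mention n.
    g = "nothing to report"
```

Trying to use `n` inside `g` would be a scope error, which is the limitation noted above.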
Your original solution seems to be the best and only workaround. Syntactically it is no heavier than direct pattern matching on function parameters, if not lighter.
But if all you need is to check preconditions rather than pattern match, don't forget about guards, which can access the where scope freely. Really, though, I see nothing wrong with your case-of solution.
f :: a -> b
f a
  | a == 2 = ...
  | isThree a = ...
  | a >= 4 = ...
  | otherwise = ...
  where isThree x = x == 3
With LambdaCase, you can also do this:
{-# language LambdaCase #-}
f :: a -> b
f = \case
    ... -> g ...
    ... -> g ...
    ...
    ... -> g ...
  where g = ...
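Filled in with concrete (hypothetical) alternatives, the LambdaCase version might look like this sketch. Note that because the function is written point-free, the `where` clause cannot refer to the argument by name.

```haskell
{-# LANGUAGE LambdaCase #-}

classify :: Int -> String
classify = \case
    0 -> g "zero"
    1 -> g "one"
    _ -> g "many"
  where
    -- Shared by every alternative; the scrutinee itself has no name here.
    g word = "n is " ++ word
```

This keeps the one-alternative-per-line layout of the multi-clause form while still allowing a single shared `where`.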
Is it safe to assume that you consistently use g on most, if not all, of the different branches of the case statement?
Operating with the assumption that f :: a -> b for some a and b (possibly polymorphic), g is necessarily some function of the form c -> d, which means that there must be a way to consistently extract a c out of an a. Call that getC :: a -> c. In that case, the solution would be to simply use h . g . getC for all cases, where h :: d -> b.
But suppose you can't always get the c out of an a. Perhaps a is of the form f c, where f is a Functor? Then you could fmap g :: f c -> f d, and then somehow transform f d into a b.
Just sort of rambling here, but fmap was the first thing that came to mind when I saw that you appeared to be applying g on every branch.

Resources