In Haskell, we have Data.Function.on:
on :: (b -> b -> c) -> (a -> b) -> a -> a -> c
(.*.) `on` f = \x y -> f x .*. f y
In Clojure, I want to be able to define, for example,
an anagram predicate as follows:
(defn anagram? [word other-word]
  (and (not= word other-word)
       ((on = sort) word other-word)))
It's trivial to implement:
(defn on [g f] (fn [x y] (g (f x) (f y))))
But is there any built-in function
that accomplishes the same goal?
I can't seem to find one.
No, there is no built-in that does what you are looking for. If you are going to implement it, though, I think you can afford to be a little more generic, since Clojure has vararg support and lacks currying:
(defn on
  ([f g]
   (fn [x y]
     (f (g x)
        (g y))))
  ([f g & args]
   (on f #(apply g % args))))
This lets you write something like
(defn same-parity? [x y]
  ((on = mod 2) x y))
which of course is easy in Haskell too, as
sameParity :: (Integral a) => a -> a -> Bool
sameParity = (==) `on` (`mod` 2)
But in Clojure the partial application of mod is a little trickier, so it's customary to provide equivalent functionality via &args if you can.
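For comparison, here is a minimal Haskell sketch of the anagram predicate from the question, written with Data.Function.on. The name isAnagram is only illustrative and does not come from the question.

```haskell
import Data.Function (on)
import Data.List (sort)

-- Two words are anagrams when they differ but have the same sorted letters.
-- isAnagram is a hypothetical name chosen for this sketch.
isAnagram :: String -> String -> Bool
isAnagram word other = word /= other && ((==) `on` sort) word other
```

Here ((==) `on` sort) plays the same role as (on = sort) in the Clojure version.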
I have a witness type for type-level lists,
data List xs where
  Nil :: List '[]
  Cons :: proxy x -> List xs -> List (x ': xs)
as well as the following utilities.
-- Type level append
type family xs ++ ys where
  '[] ++ ys = ys
  (x ': xs) ++ ys = x ': (xs ++ ys)
-- Value level append
append :: List xs -> List ys -> List (xs ++ ys)
append Nil ys = ys
append (Cons x xs) ys = Cons x (append xs ys)
-- Proof of associativity of (++)
assoc :: List xs -> proxy ys -> proxy' zs -> ((xs ++ ys) ++ zs) :~: (xs ++ (ys ++ zs))
assoc Nil _ _ = Refl
assoc (Cons _ xs) ys zs = case assoc xs ys zs of Refl -> Refl
Now, I have two different but equivalent definitions of a type-level reverse function,
-- The first version, O(n)
type Reverse xs = Rev '[] xs
type family Rev acc xs where
  Rev acc '[] = acc
  Rev acc (x ': xs) = Rev (x ': acc) xs
-- The second version, O(n²)
type family Reverse' xs where
  Reverse' '[] = '[]
  Reverse' (x ': xs) = Reverse' xs ++ '[x]
The first is more efficient, but the second is easier to use when proving things to the compiler, so it would be nice to have a proof of equivalence. In order to do this, I need a proof of Rev acc xs :~: Reverse' xs ++ acc. This is what I came up with:
revAppend :: List acc -> List xs -> Rev acc xs :~: Reverse' xs ++ acc
revAppend _ Nil = Refl
revAppend acc (Cons x xs) =
  case (revAppend (Cons x acc) xs, assoc (reverse' xs) (Cons x Nil) acc) of
    (Refl, Refl) -> Refl
reverse' :: List xs -> List (Reverse' xs)
reverse' Nil = Nil
reverse' (Cons x xs) = append (reverse' xs) (Cons x Nil)
Unfortunately, revAppend is O(n³), which completely defeats the purpose of this exercise. However, we can bypass all this and get O(1) by using unsafeCoerce:
revAppend :: Rev acc xs :~: Reverse' xs ++ acc
revAppend = unsafeCoerce Refl
Is this safe? What about the general case? For example, if I have two type families F :: k -> * and G :: k -> *, and I know that they are equivalent, is it safe to define the following?
equal :: F a :~: G a
equal = unsafeCoerce Refl
It would be very nice if GHC used a termination checker on expressions e::T where T has only one constructor with no arguments K (e.g. :~:, ()). When the check succeeds, GHC could rewrite e as K skipping the computation completely. You would have to rule out FFI, unsafePerformIO, trace, ... but it seems feasible. If this were implemented, it would solve the posted question very nicely, allowing one to actually write proofs having zero runtime cost.
Failing this, you can use unsafeCoerce in the meantime, as you propose. If you are really, really sure that the two types are the same, you can use it safely. The typical example is implementing Data.Typeable. Of course, misusing unsafeCoerce on different types would lead to unpredictable effects, hopefully a crash.
You could even write your own "safer" variant of unsafeCoerce:
unsafeButNotSoMuchCoerce :: (a :~: b) -> a -> b
#ifdef CHECK_TYPEEQ
unsafeButNotSoMuchCoerce Refl = id
#else
unsafeButNotSoMuchCoerce _ = unsafeCoerce
#endif
If CHECK_TYPEEQ is defined, the proof is pattern-matched and therefore evaluated, which leads to slower code. If it is undefined, the proof is ignored and the coercion happens at zero cost. In the latter case it is still unsafe, because you can pass bottom as the first argument and the program will not loop but will instead perform the wrong coercion. In this way you can test your program in the safe but slow mode, and then switch to the unsafe mode and pray your "proofs" were always terminating.
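The two modes can also be written side by side without CPP, which makes the difference easier to see. This is only a sketch: checkedCoerce, uncheckedCoerce and sample are names made up here, not part of any library.

```haskell
{-# LANGUAGE GADTs, TypeOperators #-}
import Data.Type.Equality
import Unsafe.Coerce (unsafeCoerce)

-- Checked variant (hypothetical name): matching on Refl forces the proof
-- to be evaluated, so a bottom proof diverges before anything is coerced.
checkedCoerce :: (a :~: b) -> a -> b
checkedCoerce Refl = id

-- Unchecked variant (hypothetical name): the proof is never inspected,
-- so even an undefined proof would be accepted silently.
uncheckedCoerce :: (a :~: b) -> a -> b
uncheckedCoerce _ = unsafeCoerce

-- With an honest proof the two variants agree.
sample :: (Int, Int)
sample = (checkedCoerce Refl 41 + 1, uncheckedCoerce Refl 41 + 1)
```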
The task was to implement foldl in terms of foldr. By comparing the two type signatures and the implementation of foldl, I came up with the following solution:
myFoldl :: (a -> b -> a) -> a -> [b] -> a
myFoldl _ acc [] = acc
myFoldl fn acc (x:xs) = foldr fn' (fn' x acc) xs
  where
    fn' = flip fn
I just flip the function's arguments to satisfy the types foldr expects, and mimic the definition of foldl by recursively applying the passed function.
To my surprise, my teacher gave this answer zero points.
I even checked that this definition nests its intermediate results in the same way as the standard foldl:
> myFoldl (\a elm -> concat ["(",a,"+",elm,")"]) "" (map show [1..10])
> "((((((((((+1)+10)+9)+8)+7)+6)+5)+4)+3)+2)"
> foldl (\a elm -> concat ["(",a,"+",elm,")"]) "" (map show [1..10])
> "((((((((((+1)+10)+9)+8)+7)+6)+5)+4)+3)+2)"
The correct answer was the following definition:
myFoldl :: (a -> b -> a) -> a -> [b] -> a
myFoldl f z xs = foldr step id xs z
  where step x g a = g (f a x)
Why is my previous definition incorrect?
Essentially, your fold goes in the wrong order. I think you didn't copy your output from foldl correctly; I get the following:
*Main> myFoldl (\ a elem -> concat ["(", a, "+", elem, ")"]) "" (map show [1..10])
"((((((((((+1)+10)+9)+8)+7)+6)+5)+4)+3)+2)"
*Main> foldl (\ a elem -> concat ["(", a, "+", elem, ")"]) "" (map show [1..10])
"((((((((((+1)+2)+3)+4)+5)+6)+7)+8)+9)+10)"
So your implementation gets the first element (the base case) correct, but then uses foldr for the rest, which results in everything else being processed backwards.
There are some nice pictures of the different orders the folds work in on the Haskell wiki:
This shows how foldr (:) [] should be the identity for lists and foldl (flip (:)) [] should reverse a list. In your case, all it does is put the first element at the end but leaves everything else in the same order. Here is exactly what I mean:
*Main> foldl (flip (:)) [] [1..10]
[10,9,8,7,6,5,4,3,2,1]
*Main> myFoldl (flip (:)) [] [1..10]
[2,3,4,5,6,7,8,9,10,1]
This brings us to a deeper and far more important point--even in Haskell, just because the types line up does not mean your code works. The Haskell type system is not omnipotent and there are often many--even an infinite number of--functions that satisfy any given type. As a degenerate example, even the following definition of myFoldl type-checks:
myFoldl :: (a -> b -> a) -> a -> [b] -> a
myFoldl _ acc _ = acc
So you have to think about exactly what your function is doing even if the types match. Thinking about things like folds might be confusing for a while, but you'll get used to it.
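To make the difference concrete, here is a small sketch that puts both definitions next to each other under different names (flawedFoldl and foldlViaFoldr are names chosen for this illustration) and unfolds the accepted one by hand in a comment.

```haskell
-- The version from the question: only the first element is combined
-- left-to-right, the rest is folded from the right.
flawedFoldl :: (a -> b -> a) -> a -> [b] -> a
flawedFoldl _ acc [] = acc
flawedFoldl fn acc (x:xs) = foldr fn' (fn' x acc) xs
  where fn' = flip fn

-- The accepted version: foldr builds a chain of functions, and the
-- accumulator is threaded through that chain from left to right.
--   foldr step id [x1,x2] z
--     = step x1 (step x2 id) z
--     = step x2 id (f z x1)
--     = id (f (f z x1) x2)
foldlViaFoldr :: (a -> b -> a) -> a -> [b] -> a
foldlViaFoldr f z xs = foldr step id xs z
  where step x g a = g (f a x)
```

With a non-commutative function such as (++), foldlViaFoldr (++) "" ["a","b","c"] gives "abc" like foldl, while flawedFoldl gives "acb".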
Suppose we want those elements of list x for which the corresponding element of list y is strictly positive. Any of the three solutions below works:
let x = [1..4]
let y = [1, -1, 2, -2]
[ snd both | both <- zip (map (> 0) y) x, fst both ]
or
map snd $ filter fst $ zip (map (>0) y) x
or
sel :: [Bool] -> [a] -> [a]
sel [] _ = []
sel (True : xs) (y : ys) = y : sel xs ys
sel (False : xs) (y : ys) = sel xs ys
sel (map (> 0) y) x
however, what prompted this was that in the R language this can be written compactly like this:
x[y > 0]
and given how much shorter that is I was wondering if there is a shorter/better way to do this in Haskell?
I'm not a Haskell specialist, but why not use a list comprehension?
[i | (i,j) <- zip x y, j > 0 ]
If you are willing to use a language extension, I can offer the alternative
{-# LANGUAGE ParallelListComp #-}
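bfilter :: (b -> Bool) -> [a] -> [b] -> [a]
-- Zip the two lists in parallel first, then filter on y. A guard placed
-- inside the second branch would filter ys before the zip.
bfilter cond xs ys = [x | (x, y) <- [(x', y') | x' <- xs | y' <- ys], cond y]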
Nothing in Haskell will be nearly as short as the R version, because in R, it's a language built-in, but in Haskell it isn't. Apparently whoever designed R found there to be good reasons to include such a primitive, but none of the Haskell designers found there to be convincing reasons to include such a construct in the language (and it wouldn't fit in nicely, so I fully endorse that decision - it may fit in well in R, I don't know that language).
zip x y >>= \(a,b) -> filter(const(b>0)) [a]
Or pointlessly using Applicative...
import Control.Applicative
zip x y >>= filter <$> const.(>0).snd <*> (:[]).fst
As Daniel Fischer says, there isn't any special syntax for this.
If you're going to be doing this operation often, it's best to define your own single reusable function, instead of having to assemble the list comprehension or map/filter chain manually every time. (Your sel doesn't pass this test because the caller has to apply the map separately.)
So
selectWhere :: [a] -> (a -> Bool) -> [b] -> [b]
selectWhere ys pred = map snd . filter (pred . fst) . zip ys
-- call it like this: selectWhere y (> 0) x
or whichever clearer definition you prefer. The important thing is that you wrap it up inside a function.
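As a quick check, here is selectWhere applied to the data from the question. This is a sketch: the parameter is renamed to p only to avoid shadowing Prelude's pred, and picked is a name introduced for the example.

```haskell
-- selectWhere as defined above, with the predicate parameter renamed to p.
selectWhere :: [a] -> (a -> Bool) -> [b] -> [b]
selectWhere ys p = map snd . filter (p . fst) . zip ys

x, y :: [Int]
x = [1 .. 4]
y = [1, -1, 2, -2]

-- Elements of x whose partner in y is strictly positive.
picked :: [Int]
picked = selectWhere y (> 0) x   -- [1,3]
```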
I'm currently on chapter 4 of Real World Haskell, and I'm trying to wrap my head around implementing foldl in terms of foldr.
(Here's their code:)
myFoldl :: (a -> b -> a) -> a -> [b] -> a
myFoldl f z xs = foldr step id xs z
  where step x g a = g (f a x)
I thought I'd try to implement zip using the same technique, but I don't seem to be making any progress. Is it even possible?
zip2 xs ys = foldr step done xs ys
  where done ys = []
        step x zipsfn [] = []
        step x zipsfn (y:ys) = (x, y) : (zipsfn ys)
How this works: (foldr step done xs) returns a function that consumes
ys; so we go down the xs list building up a nested composition of
functions that will each be applied to the corresponding part of ys.
How to come up with it: I started with the general idea (from similar
examples seen before), wrote
zip2 xs ys = foldr step done xs ys
then filled in each of the following lines in turn with what it had to
be to make the types and values come out right. It was easiest to
consider the simplest cases first before the harder ones.
The first line could be written more simply as
zip2 = foldr step done
as mattiast showed.
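The "nested composition of functions" can be seen by unfolding a small call by hand. The sketch below restates zip2 with a type signature and walks through zip2 [1,2] "abc" in a comment.

```haskell
zip2 :: [a] -> [b] -> [(a, b)]
zip2 xs ys = foldr step done xs ys
  where
    done _ = []
    step _ _      []       = []
    step x zipsfn (y:rest) = (x, y) : zipsfn rest

-- Unfolding zip2 [1,2] "abc" by hand:
--   foldr step done [1,2] "abc"
--     = step 1 (step 2 done) "abc"
--     = (1,'a') : step 2 done "bc"
--     = (1,'a') : (2,'b') : done "c"
--     = [(1,'a'),(2,'b')]
```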
The answer had already been given here, but not an (illustrative) derivation. So even after all these years, perhaps it's worth adding it.
It is actually quite simple. First,
foldr f z xs
= foldr f z [x1,x2,x3,...,xn] = f x1 (foldr f z [x2,x3,...,xn])
= ... = f x1 (f x2 (f x3 (... (f xn z) ...)))
hence by eta-expansion,
foldr f z xs ys
= foldr f z [x1,x2,x3,...,xn] ys = f x1 (foldr f z [x2,x3,...,xn]) ys
= ... = f x1 (f x2 (f x3 (... (f xn z) ...))) ys
As is apparent here, if f is non-forcing in its 2nd argument, it gets to work first on x1 and ys, f x1 r1 ys where r1 = (f x2 (f x3 (... (f xn z) ...))) = foldr f z [x2,x3,...,xn].
So, using
f x1 r1 [] = []
f x1 r1 (y1:ys1) = (x1,y1) : r1 ys1
we arrange for passage of information left-to-right along the list, by calling r1 with the rest of the input list ys1, foldr f z [x2,x3,...,xn] ys1 = f x2 r2 ys1, as the next step. And that's that.
When ys is shorter than xs (or the same length), the [] case for f fires and the processing stops. But if ys is longer than xs then f's [] case won't fire and we'll get to the final f xn z (yn:ysn) application,
f xn z (yn:ysn) = (xn,yn) : z ysn
Since we've reached the end of xs, the zip processing must stop:
z _ = []
And this means the definition z = const [] should be used:
zip xs ys = foldr f (const []) xs ys
  where
    f x r [] = []
    f x r (y:ys) = (x,y) : r ys
From the standpoint of f, r plays the role of a success continuation, which f calls when the processing is to continue, after having emitted the pair (x,y).
So r is "what is done with more ys when there are more xs", and z = const [], the nil-case in foldr, is "what is done with ys when there are no more xs". Or f can stop by itself, returning [] when ys is exhausted.
Notice how ys is used as a kind of accumulating value, which is passed from left to right along the list xs, from one invocation of f to the next ("accumulating" step being, here, stripping a head element from it).
Naturally this corresponds to the left fold, where an accumulating step is "applying the function", with z = id returning the final accumulated value when "there are no more xs":
foldl f a xs =~ foldr (\x r a-> r (f a x)) id xs a
Similarly, for finite lists,
foldr f a xs =~ foldl (\r x a-> r (f x a)) id xs a
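The second equivalence can be written out as a function to check that it preserves the order of combination. This is a sketch; foldrViaFoldl is a name chosen here for illustration.

```haskell
-- foldr expressed through foldl. Only meaningful for finite lists,
-- because foldl has to reach the end of the list before it returns.
foldrViaFoldl :: (b -> a -> a) -> a -> [b] -> a
foldrViaFoldl f a xs = foldl (\r x acc -> r (f x acc)) id xs a
```

For example, foldrViaFoldl (:) [] "abc" rebuilds "abc", just as foldr (:) [] does.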
And since the combining function gets to decide whether to continue or not, it is now possible to have left fold that can stop early:
foldlWhile t f a xs = foldr cons id xs a
  where
    cons x r a = if t x then r (f a x) else a
or a skipping left fold, foldlWhen t ..., with
cons x r a = if t x then r (f a x) else r a
etc.
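Both variants can be spelled out with type signatures as follows. This is a minimal sketch: the signatures are my reading of the definitions above, and the inner accumulator is renamed to acc to avoid shadowing.

```haskell
-- Left fold that stops at the first element failing the test t.
foldlWhile :: (b -> Bool) -> (a -> b -> a) -> a -> [b] -> a
foldlWhile t f a xs = foldr cons id xs a
  where cons x r acc = if t x then r (f acc x) else acc

-- Left fold that skips the elements failing the test t.
foldlWhen :: (b -> Bool) -> (a -> b -> a) -> a -> [b] -> a
foldlWhen t f a xs = foldr cons id xs a
  where cons x r acc = if t x then r (f acc x) else r acc
```

Because cons decides whether to call r, foldlWhile (< 4) (+) 0 [1 ..] terminates on an infinite list and yields 6, while foldlWhen even (+) 0 [1 .. 6] yields 12.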
I found a way using quite similar method to yours:
myzip = foldr step (const []) :: [a] -> [b] -> [(a,b)]
  where step a f (b:bs) = (a,b):(f bs)
        step a f [] = []
For the non-native Haskellers here, I've written a Scheme version of this algorithm to make it clearer what's actually happening:
> (define (zip lista listb)
((foldr (lambda (el func)
(lambda (a)
(if (empty? a)
empty
(cons (cons el (first a)) (func (rest a))))))
(lambda (a) empty)
lista) listb))
> (zip '(1 2 3 4) '(5 6 7 8))
(list (cons 1 5) (cons 2 6) (cons 3 7) (cons 4 8))
The foldr results in a function which, when applied to a list, returns the zip of the list folded over with the list given to the function. The Haskell version hides the inner lambda because of currying: step simply takes the second list as an extra argument.
To break it down further:
Take zip on input: '(1 2 3)
The foldr func gets called with
el->3, func->(lambda (a) empty)
This expands to:
(lambda (a) (cons (cons el (first a)) (func (rest a))))
(lambda (a) (cons (cons 3 (first a)) ((lambda (a) empty) (rest a))))
If we were to return this now, we'd have a function which takes a list of one element
and returns the pair (3 element):
> (define f (lambda (a) (cons (cons 3 (first a)) ((lambda (a) empty) (rest a)))))
> (f (list 9))
(list (cons 3 9))
Continuing, foldr now calls func with
el->2, func->f ;using f for shorthand
(lambda (a) (cons (cons el (first a)) (func (rest a))))
(lambda (a) (cons (cons 2 (first a)) (f (rest a))))
This is a func which takes a list with two elements, now, and zips them with (list 2 3):
> (define g (lambda (a) (cons (cons 2 (first a)) (f (rest a)))))
> (g (list 9 1))
(list (cons 2 9) (cons 3 1))
What's happening?
(lambda (a) (cons (cons 2 (first a)) (f (rest a))))
a, in this case, is (list 9 1)
(cons (cons 2 (first (list 9 1))) (f (rest (list 9 1))))
(cons (cons 2 9) (f (list 1)))
And, as you recall, f zips its argument with 3.
And this continues etc...
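For readers going the other way, the Scheme version can be transliterated back into Haskell with the inner lambda written out explicitly. This is a sketch; zipLam is a name made up for the illustration.

```haskell
-- Each step returns a function that consumes the second list, exactly
-- like the (lambda (a) ...) in the Scheme code.
zipLam :: [a] -> [b] -> [(a, b)]
zipLam lista listb =
  foldr
    (\el func -> \a -> case a of
        []       -> []
        (b : bs) -> (el, b) : func bs)
    (\_ -> [])
    lista
    listb
```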
The problem with all these solutions for zip is that they only fold over one list or the other, which can be a problem if both of them are "good producers", in the parlance of list fusion. What you actually need is a solution that folds over both lists. Fortunately, there is a paper about exactly that, called "Coroutining Folds with Hyperfunctions".
You need an auxiliary type, a hyperfunction, which is basically a function that takes another hyperfunction as its argument.
newtype H a b = H { invoke :: H b a -> b }
The hyperfunctions used here basically act like a "stack" of ordinary functions.
push :: (a -> b) -> H a b -> H a b
push f q = H $ \k -> f $ invoke k q
You also need a way to put two hyperfunctions together, end to end.
(.#.) :: H b c -> H a b -> H a c
f .#. g = H $ \k -> invoke f $ g .#. k
This is related to push by the law:
(push f x) .#. (push g y) = push (f . g) (x .#. y)
This turns out to be an associative operator, and this is the identity:
self :: H a a
self = H $ \k -> invoke k self
You also need something that disregards everything else on the "stack" and returns a specific value:
base :: b -> H a b
base b = H $ const b
And finally, you need a way to get a value out of a hyperfunction:
run :: H a a -> a
run q = invoke q self
run strings all of the pushed functions together, end to end, until it hits a base or loops infinitely.
So now you can fold both lists into hyperfunctions, using functions that pass information from one to the other, and assemble the final value.
zip xs ys = run $ foldr (\x h -> push (first x) h) (base []) xs
              .#. foldr (\y h -> push (second y) h) (base Nothing) ys
  where
    first _ Nothing = []
    first x (Just (y, xys)) = (x, y):xys
    second y xys = Just (y, xys)
Folding over both lists matters because of a GHC optimization called list fusion, which is described in the GHC.Base module but probably deserves to be much better known. Being a good list producer and using build with foldr can prevent lots of useless production and immediate consumption of list elements, and can expose further optimizations.
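Put together as one module, the pieces above look like the sketch below. The zip is renamed zipH to avoid clashing with Prelude's zip, and stackDemo is a small example added here to show how run walks the "stack"; both names are my own.

```haskell
newtype H a b = H { invoke :: H b a -> b }

push :: (a -> b) -> H a b -> H a b
push f q = H $ \k -> f $ invoke k q

(.#.) :: H b c -> H a b -> H a c
f .#. g = H $ \k -> invoke f $ g .#. k

self :: H a a
self = H $ \k -> invoke k self

base :: b -> H a b
base b = H $ const b

run :: H a a -> a
run q = invoke q self

-- Two pushed functions on top of a base value:
-- this evaluates to (+ 1) ((* 2) 5).
stackDemo :: Int
stackDemo = run (push (+ 1) (push (* 2) (base 5)))

-- The zip from the text, folding over both lists.
zipH :: [a] -> [b] -> [(a, b)]
zipH xs ys = run $ foldr (\x h -> push (first x) h) (base []) xs
               .#. foldr (\y h -> push (second y) h) (base Nothing) ys
  where
    first _ Nothing = []
    first x (Just (y, xys)) = (x, y) : xys
    second y xys = Just (y, xys)
```

Here stackDemo is 11, and zipH stops at the shorter list, so zipH [1,2,3] "ab" gives [(1,'a'),(2,'b')].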
I wanted to understand this elegant solution, so I tried to derive the types and the evaluation myself. So, we need to write a function:
zip xs ys = foldr step done xs ys
Here we need to derive step and done, whatever they are. Recall foldr's type, instantiated to lists:
foldr :: (a -> state -> state) -> state -> [a] -> state
However our foldr invocation must be instantiated to something like below, because we must accept not one, but two list arguments:
foldr :: (a -> ? -> ?) -> ? -> [a] -> [b] -> [(a,b)]
Because -> is right-associative, this is equivalent to:
foldr :: (a -> ? -> ?) -> ? -> [a] -> ([b] -> [(a,b)])
Our ([b] -> [(a,b)]) corresponds to state type variable in the original foldr type signature, therefore we must replace every occurrence of state with it:
foldr :: (a -> ([b] -> [(a,b)]) -> ([b] -> [(a,b)]))
-> ([b] -> [(a,b)])
-> [a]
-> ([b] -> [(a,b)])
This means that arguments that we pass to foldr must have the following types:
step :: a -> ([b] -> [(a,b)]) -> [b] -> [(a,b)]
done :: [b] -> [(a,b)]
xs :: [a]
ys :: [b]
Recall that foldr (+) 0 [1,2,3] expands to:
1 + (2 + (3 + 0))
Therefore if xs = [1,2,3] and ys = [4,5,6,7], our foldr invocation would expand to:
1 `step` (2 `step` (3 `step` done)) $ [4,5,6,7]
This means that our 1 `step` (2 `step` (3 `step` done)) construct must create a recursive function that would go through [4,5,6,7] and zip up the elements. (Keep in mind that if one of the original lists is longer, the excess values are thrown away.) In other words, our construct must have the type [b] -> [(a,b)].
3 `step` done is our base case, where done is an initial value, like 0 in foldr (+) 0 [1..3]. We don't want to zip anything after 3, because 3 is the final value of xs, so we must terminate the recursion. How do you terminate recursion over a list in the base case? You return the empty list []. But recall the type signature of done:
done :: [b] -> [(a,b)]
Therefore we can't return just [], we must return a function that would ignore whatever it receives. Therefore use const:
done = const [] -- this is equivalent to done = \_ -> []
Now let's start figuring out what step should be. It combines a value of type a with a function of type [b] -> [(a,b)] and returns a function of type [b] -> [(a,b)].
In 3 `step` done, we know that the result value that would later go to our zipped list must be (3,6) (knowing from original xs and ys). Therefore 3 `step` done must evaluate into:
\(y:ys) -> (3,y) : done ys
Remember, we must return a function inside which we somehow zip up the elements; the code above is what makes sense and typechecks.
Now that we have assumed how step should evaluate, let's continue the evaluation. Here is what all the reduction steps in our foldr evaluation look like:
3 `step` done -- becomes
(\(y:ys) -> (3,y) : done ys)
2 `step` (\(y:ys) -> (3,y) : done ys) -- becomes
(\(y:ys) -> (2,y) : (\(y:ys) -> (3,y) : done ys) ys)
1 `step` (\(y:ys) -> (2,y) : (\(y:ys) -> (3,y) : done ys) ys) -- becomes
(\(y:ys) -> (1,y) : (\(y:ys) -> (2,y) : (\(y:ys) -> (3,y) : done ys) ys) ys)
The evaluation gives rise to this implementation of step (note that we account for ys running out of elements early by returning an empty list):
step x f = \zs -> case zs of
  [] -> []
  (y:ys) -> (x,y) : f ys
Thus, the full function zip is implemented as follows:
zip :: [a] -> [b] -> [(a,b)]
zip xs ys = foldr step done xs ys
  where done = const []
        step x f [] = []
        step x f (y:ys) = (x,y) : f ys
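The types derived above can also be written down explicitly so that the compiler checks them. In this sketch, Zipper is a type synonym introduced only for the fold's "state" type, and zipFold is a name chosen to avoid clashing with Prelude's zip.

```haskell
-- The "state" carried by the fold: a function still waiting for ys.
type Zipper a b = [b] -> [(a, b)]

done :: Zipper a b
done = const []

step :: a -> Zipper a b -> Zipper a b
step _ _ []       = []
step x f (y : ys) = (x, y) : f ys

zipFold :: [a] -> Zipper a b
zipFold = foldr step done
```

With the example lists from the text, zipFold [1,2,3] [4,5,6,7] gives [(1,4),(2,5),(3,6)].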
P.S.: If you are inspired by the elegance of folds, read Writing foldl using foldr and then Graham Hutton's A tutorial on the universality and expressiveness of fold.
A simple approach:
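lZip, rZip :: Foldable t => [b] -> t a -> [(a, b)]
-- implement zip using fold?
lZip xs ys = reverse . fst $ foldl f ([], xs) ys
  where f (zs, y:ys') x = ((x, y) : zs, ys')
        f (zs, [])    _ = (zs, [])  -- xs ran out: ignore the rest of ys
-- Or; this one pairs from the right, so it matches zip only when both
-- inputs have the same length
rZip xs ys = fst $ foldr f ([], reverse xs) ys
  where f x (zs, y:ys') = ((x, y) : zs, ys')
        f _ (zs, [])    = (zs, [])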