Rules firing on class methods - haskell

My apologies if this is a somewhat vague question; I just want to know whether I'm heading in the right direction. (GHC 8.0.2)
I've got a "list like" data type, let's call it T a. Among other things, it's an instance of Foldable, Functor and Monoid.
There are a few rules I'd like to put into place, for example (roughly speaking, not using exact syntax here):
You can fold two lists appended just by folding the first then folding the second (there's no need to actually append the lists):
foldl' f z (x ++ y) -> foldl' f (foldl' f z x) y
If you're folding a list that's been mapped, you can just drag the map function into the fold, eliminating the map:
foldl' f z (fmap g x) -> let h acc e = f acc (g e) in foldl' h z x
If you want to map appended lists, just map the individual ones then append them:
fmap f (x ++ y) -> fmap f x ++ fmap f y
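In actual RULES syntax, those three might read roughly as follows. This is an untested sketch: foldlT', fmapT and appendT are hypothetical names for the top-level workers standing behind the instance methods.
{-# RULES
"foldl'/append" forall f z x y. foldlT' f z (appendT x y) = foldlT' f (foldlT' f z x) y
"foldl'/fmap"   forall f z g x. foldlT' f z (fmapT g x)   = foldlT' (\acc e -> f acc (g e)) z x
"fmap/append"   forall f x y.   fmapT f (appendT x y)     = appendT (fmapT f x) (fmapT f y)
  #-}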
How should I be doing this? What I've currently done is have my instance methods call a top-level function, like fmap', which does all the work, and then I've applied the rules to these functions. However, -ddump-rule-rewrites doesn't seem to show my rules firing, but there are a lot of Class op * rules firing, like Class op fmap and Class op foldl'
Should I be:
a. Doing what I'm doing?
b. Applying my rules to the class methods directly?
c. Nothing at all, since the standard class rules should cover the situations for my data type? Or
d. Some combination of the above, involving a mixture of INLINE and NOINLINE pragmas at the appropriate stage (please detail)?
I suspect the answer is (d), I'd just like some guidance to get started.
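For reference, the worker/wrapper arrangement that option (d) hints at usually looks something like the sketch below. The type T and the body of fmapT are placeholders for the real definitions, and the phase number is the conventional choice rather than the only one.
-- Hypothetical stand-in for the list-like type in question.
newtype T a = T [a]

fmapT :: (a -> b) -> T a -> T b
fmapT f (T xs) = T (map f xs)   -- placeholder body; the rule below targets this name
{-# NOINLINE [1] fmapT #-}      -- not inlined before phase 1, so rules can match it first

{-# RULES
"fmapT/fmapT" forall f g t. fmapT f (fmapT g t) = fmapT (f . g) t
  #-}

instance Functor T where
  fmap = fmapT
  {-# INLINE fmap #-}           -- the built-in Class op fmap rewrite then exposes fmapT to your rules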

Related

In Haskell, why is map a separate function to fmap? [duplicate]

Everywhere I've tried using map, fmap has worked as well. Why did the creators of Haskell feel the need for a map function? Couldn't it just be what is currently known as fmap and fmap could be removed from the language?
I would like to post an answer to draw attention to augustss's comment:
That's not actually how it happens. What happened was that the type of map was generalized to cover Functor in Haskell 1.3. I.e., in Haskell 1.3 fmap was called map. This change was then reverted in Haskell 1.4 and fmap was introduced. The reason for this change was pedagogical; when teaching Haskell to beginners the very general type of map made error messages more difficult to understand. In my opinion this wasn't the right way to solve the problem.
Haskell 98 is seen as a step backwards by some Haskellers (including me), previous versions having defined a more abstract and consistent library. Oh well.
Quoting from the Functor section of the Typeclassopedia at https://wiki.haskell.org/Typeclassopedia#Functor
You might ask why we need a separate map function. Why not just do
away with the current list-only map function, and rename fmap to map
instead? Well, that’s a good question. The usual argument is that
someone just learning Haskell, when using map incorrectly, would much
rather see an error about lists than about Functor.
They look the same at the application site but they're different, of course. When you apply either of those two functions, map or fmap, to a list of values they will produce the same result, but that doesn't mean they're meant for the same purpose.
Run a GHCi session (the interactive Glasgow Haskell Compiler) to query for information about those two functions, then have a look at their implementations and you will discover many differences.
map
Query GHCi for information about map
Prelude> :info map
map :: (a -> b) -> [a] -> [b] -- Defined in ‘GHC.Base’
and you'll see it defined as a higher-order function applicable to a list of values of any type a, yielding a list of values of any type b. Although polymorphic (the a and b in the above definition stand for any type), the map function is intended to be applied to a list of values, which is just one possible data type amongst many others in Haskell. The map function cannot be applied to something which is not a list of values.
As you can read from the GHC.Base source code, the map function is implemented as follows
map _ []     = []
map f (x:xs) = f x : map f xs
which makes use of pattern matching to pull the head (the x) off the tail (the xs) of the list, then constructs a new list by using the : (cons) value constructor so as to prepend f x (read it as "f applied to x") to the recursion of map over the tail, until the list is empty. It's worth noting that the implementation of the map function does not rely upon any other function, just on itself.
fmap
Now try to query for information about fmap and you'll see something quite different.
Prelude> :info fmap
class Functor (f :: * -> *) where
  fmap :: (a -> b) -> f a -> f b
  ...
        -- Defined in ‘GHC.Base’
This time fmap is defined as one of the functions whose implementations must be provided by those data types which wish to belong to the Functor type class. That means there can be more than one data type, not only the "list of values" data type, able to provide an implementation for the fmap function. That makes fmap applicable to a much larger set of data types: the functors indeed!
As you can read from the GHC.Base source code, a possible implementation of the fmap function is the one provided by the Maybe data type:
instance Functor Maybe where
  fmap _ Nothing  = Nothing
  fmap f (Just a) = Just (f a)
and another possible implementation is the one provided by the 2-tuple data type
instance Functor ((,) a) where
  fmap f (x, y) = (x, f y)
and another possible implementation is the one provided by the list data type (of course!):
instance Functor [] where
  fmap f xs = map f xs
which relies upon the map function.
Conclusion
The map function can be applied to nothing more than lists of values (where the values may be of any type), whereas the fmap function can be applied to many more data types: all of those which belong to the Functor class (e.g. Maybe, tuples, lists, etc.). Since the "list of values" data type is also a functor (because it provides an implementation of fmap), fmap can be applied to it as well, producing the very same result as map.
map (+3) [1..5]
fmap (+3) (Just 15)
fmap (+3) (5, 7)
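Evaluated in GHCi, those three calls give:
Prelude> map (+3) [1..5]
[4,5,6,7,8]
Prelude> fmap (+3) (Just 15)
Just 18
Prelude> fmap (+3) (5, 7)
(5,10)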

Rewriting as a practical optimization technique in GHC: Is it really needed?

I was reading the paper by Simon Peyton Jones et al. titled “Playing by the Rules: Rewriting as a practical optimization technique in GHC”. In the second section, namely “The basic idea”, they write:
Consider the familiar map function, that applies a function to each element of a list. Written in Haskell, map looks like this:
map f []     = []
map f (x:xs) = f x : map f xs
Now suppose that the compiler encounters the following call of map:
map f (map g xs)
We know that this expression is equivalent to
map (f . g) xs
(where “.” is function composition), and we know that the latter expression is more efficient than the former because there is no intermediate list. But the compiler has no such knowledge.
One possible rejoinder is that the compiler should be smarter --- but the programmer will always know things that the compiler cannot figure out. Another suggestion is this: allow the programmer to communicate such knowledge directly to the compiler. That is the direction we explore here.
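In GHC this is exactly what a RULES pragma expresses; the paper's example corresponds to the following rule, which the simplifier applies wherever the left-hand side matches:
{-# RULES
"map/map" forall f g xs. map f (map g xs) = map (f . g) xs
  #-}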
My question is, why can't we make the compiler smarter? The authors say that “but the programmer will always know things that the compiler cannot figure out”. However, that's not a valid answer because the compiler can indeed figure out that map f (map g xs) is equivalent to map (f . g) xs, and here is how:
map f (map g xs)
map g xs unifies with map f [] = [].
Hence map g [] = [].
map f (map g []) = map f [].
map f [] unifies with map f [] = [].
Hence map f (map g []) = [].
map g xs unifies with map f (x:xs) = f x : map f xs.
Hence map g (x:xs) = g x : map g xs.
map f (map g (x:xs)) = map f (g x : map g xs).
map f (g x : map g xs) unifies with map f (x:xs) = f x : map f xs.
Hence map f (map g (x:xs)) = f (g x) : map f (map g xs).
Hence we now have the rules:
map f (map g []) = []
map f (map g (x:xs)) = f (g x) : map f (map g xs)
As you can see, f (g x) is just (f . g) x, and map f (map g xs) is being called recursively. This is exactly the definition of map (f . g) xs. The algorithm for this automatic conversion seems to be pretty simple. So why not implement this instead of rewrite rules?
Aggressive inlining can derive many of the equalities that rewrite rules are short-hand for.
The difference is that inlining is "blind", so you don't know in advance if the result will be better or worse, or even if it will terminate.
Rewrite rules, however, can do completely non-obvious things, based on much higher level facts about the program. Think of rewrite rules as adding new axioms to the optimizer. By adding these you have a richer rule set to apply, making complicated optimizations easier to apply.
Stream fusion, for example, changes the data type representation. This cannot be expressed through inlining, as it involves a representation type change (we reframe the optimization problem in terms of the Stream ADT). Easy to state in rewrite rules, impossible with inlining alone.
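To make that concrete, here is a minimal sketch of the stream-fusion idea in the style of Coutts, Leshchinskiy and Stewart. The real library (and GHC's built-in list fusion, which uses foldr/build instead) is considerably more elaborate.
{-# LANGUAGE ExistentialQuantification #-}

-- A list reframed as a state machine: a step function plus a current state.
data Step s a = Done | Yield a s
data Stream a = forall s. Stream (s -> Step s a) s

stream :: [a] -> Stream a           -- into the Stream representation
stream = Stream next
  where next []     = Done
        next (x:xs) = Yield x xs

unstream :: Stream a -> [a]         -- back to an ordinary list
unstream (Stream next s0) = go s0
  where go s = case next s of
                 Done       -> []
                 Yield x s' -> x : go s'

-- map over the Stream representation: no intermediate list is ever built.
mapS :: (a -> b) -> Stream a -> Stream b
mapS f (Stream next s0) = Stream next' s0
  where next' s = case next s of
                    Done       -> Done
                    Yield x s' -> Yield (f x) s'

-- Rewrite map into the fusible form, then cancel adjacent conversions, so that
-- map f (map g xs) ends up as unstream (mapS f (mapS g (stream xs))).
{-# RULES
"map -> fusible"  forall f xs. map f xs = unstream (mapS f (stream xs))
"stream/unstream" forall s.    stream (unstream s) = s
  #-}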
Something in that direction was investigated in a Bachelor’s thesis of Johannes Bader, a student of mine: Finding Equations in Functional Programs (PDF file).
To some degree it is certainly possible, but
it is quite tricky; finding such equations is in a sense as hard as finding proofs in a theorem prover, and
it is often not very useful, because it tends to find equations that the programmer would rarely write directly.
It is, however, useful for cleaning up after other transformations such as inlining and various forms of fusion.
This comes down to a balance between optimizing for the specific case and optimizing for the general case, and that balance can generate funny situations where you know how to make something faster, but it is better for the language in general if you don't.
In the specific case of maps in the structure you give, the compiler could find the optimization. However, what about related structures? What if the function isn't map? What if there's an additional layer of indirection, such as a function that returns map? In those cases, the compiler cannot optimize easily. This is the general-case problem.
Now, if you do optimize the special case, one of two outcomes occurs:
Nobody relies on it, because they aren't sure whether it is there or not. In this case, articles like the one you quote get written.
People do start relying on it, and now every developer is forced to remember "maps done in this configuration get automatically converted to the fast version for me, but if I do it in that configuration they don't." This starts to shape the way people use the language, and can actually reduce readability!
Given the need for developers to think about such optimizations in the general case anyway, we expect to see them doing these optimizations in the simple case themselves, decreasing the need for the optimization in the first place!
Now, if it turns out that the particular case you are interested in accounts for something massive, like 2% of the world's Haskell codebase, there would be a much stronger argument for applying your special-case optimization.

Evaluation strategy

How should one reason about function evaluation in examples like the following in Haskell:
let f x = ...
    x = ...
in map (g (f x)) xs
In GHC, sometimes (f x) is evaluated only once, and sometimes once for each element in xs, depending on what exactly f and g are. This can be important when f x is an expensive computation. It has just tripped up a Haskell beginner I was helping, and I didn't know what to tell him other than that it is up to the compiler. Is there a better story?
Update
In the following example (f x) will be evaluated 4 times:
let f x = trace "!" $ zip x x
    x = "abc"
in map (\i -> lookup i (f x)) "abcd"
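(For the record: run in GHCi without optimisations, the trace in f fires once per character of "abcd", printing "!" four times, and the result is [Just 'a',Just 'b',Just 'c',Nothing].)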
With language extensions, we can create situations where f x must be evaluated repeatedly:
{-# LANGUAGE GADTs, Rank2Types #-}
module MultiEvG where

data BI where
  B :: (Bounded b, Integral b) => b -> BI

foo :: [BI] -> [Integer]
foo xs = let f :: (Integral c, Bounded c) => c -> c
             f x = maxBound - x
             g :: (forall a. (Integral a, Bounded a) => a) -> BI -> Integer
             g m (B y) = toInteger (m + y)
             x :: (Integral i) => i
             x = 3
         in map (g (f x)) xs
The crux is to have f x polymorphic even as the argument of g, and we must create a situation where the type(s) at which it is needed can't be predicted (my first stab used an Either a b instead of BI, but when optimising, that of course led to only two evaluations of f x at most).
A polymorphic expression must be evaluated at least once for each type it is used at. That's one reason for the monomorphism restriction. However, when the range of types it can be needed at is restricted, it is possible to memoise the values at each type, and in some circumstances GHC does that (needs optimising, and I expect the number of types involved mustn't be too large). Here we confront it with what is basically an inhomogeneous list, so in each invocation of g (f x), it can be needed at an arbitrary type satisfying the constraints, so the computation cannot be lifted outside the map (technically, the compiler could still build a cache of the values at each used type, so it would be evaluated only once per type, but GHC doesn't, in all likelihood it wouldn't be worth the trouble).
Monomorphic expressions need only be evaluated once; they can be shared. Whether they are is up to the implementation; by purity, it doesn't change the semantics of the programme. If the expression is bound to a name, in practice you can rely on it being shared, since it's easy and obviously what the programmer wants. If it isn't bound to a name, it's a question of optimisation. With the bytecode generator or without optimisations, the expression will often be evaluated repeatedly, but with optimisations repeated evaluation would indicate a compiler bug.
Polymorphic expressions must be evaluated at least once for every type they're used at, but with optimisations, when GHC can see that it may be used multiple times at the same type, it will (usually) still be shared for that type during a larger computation.
Bottom line: Always compile with optimisations, help the compiler by binding expressions you want shared to a name, and give monomorphic type signatures where possible.
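Applied to the trace example from the question, that advice amounts to giving the shared value a name:
let f x = trace "!" $ zip x x
    x = "abc"
    fx = f x                        -- bound to a name, so computed once and shared
in map (\i -> lookup i fx) "abcd"   -- "!" is now printed a single time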
Your examples are indeed quite different.
In the first example, the argument to map is g (f x), and it is passed once to map, most likely as a partially applied function.
Should g (f x), when applied to an argument within map, evaluate its first argument, then this will be done only once, and the thunk (f x) will then be updated with the result.
Hence, in your first example, f x will be evaluated at most once.
Your second example requires a deeper analysis before the compiler can arrive at the conclusion that (f x) is always constant in the lambda expression. Perhaps it will never optimize it at all, because it may have knowledge that trace is not quite kosher. So, this may evaluate 4 times when tracing, and 4 times or 1 time when not tracing.
This is really dependent on GHC's optimizations, as you've been able to tell.
The best thing to do is to study the GHC Core that you get after optimizing the program. I would look at the generated Core and examine whether f x has its own let binding outside the map or not.
If you want to be sure, then you should factor f x out into its own variable assigned in a let, but there's not really a guaranteed way to figure it out other than reading through Core.
All that said, with the exception of things like trace that use unsafePerformIO, this will never change the semantics of your program: how it actually behaves.
In GHC without optimizations, the body of a function is evaluated every time the function is called. (A "call" means the function is applied to arguments and the result is evaluated.) In the following example, f x is inside a function, so it will execute each time the function is called.
(GHC may optimize this expression as discussed in the FAQ [1].)
let f x = trace "!" $ zip x x
    x = "abc"
in map (\i -> lookup i (f x)) "abcd"
However, if we move f x out of the function, it will execute only once.
let f x = trace "!" $ zip x x
    x = "abc"
in map ((\f_x i -> lookup i f_x) (f x)) "abcd"
This can be rewritten more readably as
let f x = trace "!" $ zip x x
    x = "abc"
    g f_x i = lookup i f_x
in map (g (f x)) "abcd"
The general rule is that, each time a function is applied to an argument, a new "copy" of the function body is created. Function application is the only thing that may cause an expression to re-execute. However, be warned that some functions and function calls do not look like functions syntactically.
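For example, tying this back to the polymorphism answer above: a polymorphic binding such as x :: Num a => a; x = 3 compiles into a function taking a Num dictionary, so each use of x at a fresh type is a hidden function call, even though no application is visible in the source.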
[1] http://www.haskell.org/haskellwiki/GHC/FAQ#Subexpression_Elimination

When are lambda forms necessary in Haskell?

I'm a newbie to Haskell, and a relative newbie to functional programming.
In other (besides Haskell) languages, lambda forms are often very useful.
For example, in Scheme:
(define (deriv-approx f)
  (lambda (h x)
    (/ (- (f (+ x h))
          (f x))
       h)))
Would create a closure (over the function f) to approximate a derivative (at value x, with interval h).
However, this usage of a lambda form doesn't seem to be necessary in Haskell, due to its partial application:
derivApprox f h x = ((f (x + h)) - (f x)) / h
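For instance, a quick GHCi check (the exact final digits depend on floating-point rounding):
Prelude> let derivApprox f h x = (f (x + h) - f x) / h
Prelude> derivApprox sin 1e-3 0   -- should be close to cos 0 = 1
0.9999998333333333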
What are some examples where lambda forms are necessary in Haskell?
Edit: replaced 'closure' with 'lambda form'
I'm going to give two slightly indirect answers.
First, consider the following code:
module Lambda where
derivApprox f h x = ( (f (x + h)) - (f x) ) / h
I've compiled this while telling GHC to dump an intermediate representation, which is roughly a simplified version of Haskell used as part of the compilation process, to get this:
Lambda.derivApprox
  :: forall a. GHC.Real.Fractional a => (a -> a) -> a -> a -> a
[LclIdX]
Lambda.derivApprox =
  \ (@ a) ($dFractional :: GHC.Real.Fractional a) ->
    let {
      $dNum :: GHC.Num.Num a
      [LclId]
      $dNum = GHC.Real.$p1Fractional @ a $dFractional } in
    \ (f :: a -> a) (h :: a) (x :: a) ->
      GHC.Real./
        @ a
        $dFractional
        (GHC.Num.- @ a $dNum (f (GHC.Num.+ @ a $dNum x h)) (f x))
        h
If you look past the messy annotations and verbosity, you should be able to see that the compiler has turned everything into lambda expressions. We can consider this an indication that you probably don't need to do so manually.
Conversely, let's consider a situation where you might need lambdas. Here's a function that uses a fold to compose a list of functions:
composeAll :: [a -> a] -> a -> a
composeAll = foldr (.) id
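A quick usage example; the functions compose right-to-left, just like the (.) chain the fold builds:
composeAll [(+1), (*2)] 3   -- evaluates to 7, i.e. (+1) ((*2) 3)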
What's that? Not a lambda in sight! In fact, we can go the other way, as well:
composeAll' :: [a -> a] -> a -> a
composeAll' xs x = foldr (\f g x -> f (g x)) id xs x
Not only is this full of lambdas, it's also taking two arguments to the main function and, what's more, applying foldr to all of them. Compare the type of foldr, (a -> b -> b) -> b -> [a] -> b, to the above; apparently it takes three arguments, but above we've applied it to four! Not to mention that the accumulator function takes two arguments, but we have a three-argument lambda here. The trick, of course, is that both are returning a function that takes a single argument; and we're simply applying that argument on the spot, instead of juggling lambdas around.
All of which, hopefully, has convinced you that the two forms are equivalent. Lambda forms are never necessary, or perhaps always necessary, because who can tell the difference?
There is no semantic difference between
f x y z w = ...
and
f x y = \z w -> ...
The main difference between expression style (explicit lambdas) and declaration style is a syntactic one. One situation where it matters is when you want to use a where clause:
f x y = \z w -> ...
where ... -- x and y are in scope, z and w are not
It is indeed possible to write any Haskell program without using an explicit lambda anywhere by replacing them with named local functions or partial application.
See also: Declaration vs. expression style.
When you can declare named curried functions (such as your Haskell derivApprox) it is never necessary to use an explicit lambda expression. Every explicit lambda expression can be replaced with a partial application of a named function that takes the free variables of the lambda expression as its first parameters.
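For instance (an illustrative sketch; addTo is a made-up name): the lambda \y -> x + y has x free, so x becomes the first parameter of the named replacement.
addTo :: Int -> Int -> Int
addTo x y = x + y                     -- the lambda's free variable x is now the first parameter

example :: Int -> [Int]
example x = map (addTo x) [1, 2, 3]   -- instead of: map (\y -> x + y) [1, 2, 3]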
Why one would want to do this in source code is not easy to see, but some implementations essentially work that way.
Also, somewhat beside the point, would the following rewriting (different from what I've just described) count as avoiding lambdas for you?
derivApprox f = let myfunc h x = (f (x + h) - f x) / h in myfunc
If you only use a function once, e.g. as a parameter to map or foldr or some other higher-order function, then it is often better to use a lambda than a named function, because it immediately becomes clear that the function isn't used anywhere else - it can't be, because it doesn't have a name. When you introduce a new named function, you give people reading your code another thing to remember for the duration of the scope. So lambdas are never strictly speaking necessary, but they are often preferable to the alternative.

