How does evaluation in Haskell work for expressions with constraints?

Suppose I write in GHCi:
GHCi> let x = 1 + 2 :: Integer
GHCi> seq x ()
GHCi> :sprint x
GHCi prints x = 3 as naturally expected.
However,
GHCi> let x = 1 + 2
GHCi> seq x ()
GHCi> :sprint x
yields x = _
The sole difference between the two expressions is their type (Integer vs Num a => a). My question is: what exactly happens, and why does x seemingly not get evaluated in the latter example?

The main issue is that
let x = 1 + 2
defines a polymorphic value of type forall a. Num a => a, and that is something which evaluates similarly to a function.
Each use of x can be made at a different type, e.g. x :: Int, x :: Integer, x :: Double and so on. These results are not "cached" in any way, but recomputed every time, as if x were a function which is called multiple times, so to speak.
Indeed, a common implementation of type classes implements such a polymorphic x as a function
x :: NumDict a -> a
where the NumDict a argument above is added by the compiler automatically, and carries information about a being a Num type, including how to perform addition, how to interpret integer literals inside a, and so on. This is called the "dictionary-passing" implementation.
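A minimal sketch of this translation in plain Haskell (NumDict, addD, fromIntD, and intDict are names invented here for exposition; GHC's real dictionaries differ):
data NumDict a = NumDict
  { addD     :: a -> a -> a
  , fromIntD :: Integer -> a
  }

-- A hand-written dictionary witnessing that Int is a Num type.
intDict :: NumDict Int
intDict = NumDict (+) fromInteger

-- The polymorphic x = 1 + 2 becomes a function of the dictionary,
-- so its body is re-run on every application, e.g. x intDict:
x :: NumDict a -> a
x d = addD d (fromIntD d 1) (fromIntD d 2)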
So, using a polymorphic x multiple times indeed corresponds to invoking a function multiple times, causing recomputation. To avoid this, the (dreaded) Monomorphism Restriction was introduced in Haskell, forcing x to be monomorphic instead. The MR is not a perfect solution, and can create some surprising type errors in certain cases.
To alleviate this issue, the MR is disabled by default in GHCi, since in GHCi we don't care that much about performance -- usability is more important there. This however causes the recomputation to reappear, as you discovered.
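You can check this in GHCi by turning the MR back on (a sketch; the exact :sprint output may vary between GHC versions):
GHCi> :set -XMonomorphismRestriction
GHCi> let x = 1 + 2
GHCi> seq x ()
GHCi> :sprint x
x = 3
With the MR in effect, x is forced to be monomorphic (defaulting to Integer), so the forced result is shared and visible to :sprint.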

Related

How can Haskell integer literals be comparable without being in the Eq class?

In Haskell (at least with GHC v8.8.4), being in the Num class does NOT imply being in the Eq class:
$ ghci
GHCi, version 8.8.4: https://www.haskell.org/ghc/ :? for help
λ>
λ> let { myEqualP :: Num a => a -> a -> Bool ; myEqualP x y = x==y ; }
<interactive>:6:60: error:
    • Could not deduce (Eq a) arising from a use of ‘==’
      from the context: Num a
        bound by the type signature for:
                   myEqualP :: forall a. Num a => a -> a -> Bool
        at <interactive>:6:7-41
      Possible fix:
        add (Eq a) to the context of
          the type signature for:
            myEqualP :: forall a. Num a => a -> a -> Bool
    • In the expression: x == y
      In an equation for ‘myEqualP’: myEqualP x y = x == y
λ>
It seems this is because, for example, Num instances can be defined for some function types.
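For instance, here is a legal (if unconventional) Num instance for functions, sketched out; functions have no Eq instance, so Num cannot imply Eq:
-- Lift numeric operations pointwise to functions.
instance Num b => Num (a -> b) where
  f + g         = \x -> f x + g x
  f * g         = \x -> f x * g x
  f - g         = \x -> f x - g x
  abs f         = abs . f
  signum f      = signum . f
  fromInteger n = \_ -> fromInteger n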
Furthermore, if we prevent ghci from overguessing the type of integer literals, they have just the Num type constraint:
λ>
λ> :set -XNoMonomorphismRestriction
λ>
λ> x=42
λ> :type x
x :: Num p => p
λ>
Hence, terms like x or 42 above have no reason to be comparable.
But still, they happen to be:
λ>
λ> y=43
λ> x == y
False
λ>
Can somebody explain this apparent paradox?
Integer literals can't be compared without using Eq. But that's not what is happening, either.
In GHCi, under NoMonomorphismRestriction (which is the default in GHCi nowadays; not sure about GHC 8.8.4) x = 42 results in a variable x of type forall p. Num p => p.1
Then you do y = 43, which similarly results in the variable y having type forall q. Num q => q.2
Then you enter x == y, and GHCi has to evaluate in order to print True or False. That evaluation cannot be done without picking a concrete type for both p and q (which has to be the same). Each type has its own code for the definition of ==, so there's no way to run the code for == without deciding which type's code to use.3
However each of x and y can be used as any type in Num (because they have a definition that works for all of them)4. So we can just use (x :: Int) == y and the compiler will determine that it should use the Int definition for ==, or x == (y :: Double) to use the Double definition. We can even do this repeatedly with different types! None of these uses change the type of x or y; we're just using them each time at one of the (many) types they support.
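For example, continuing the session above:
λ> (x :: Int) == y
False
λ> x == (y :: Double)
False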
Without the concept of defaulting, a bare x == y would just produce an Ambiguous type variable error from the compiler. The language designers thought that would be extremely common and extremely annoying with numeric literals in particular (because the literals are polymorphic, but as soon as you do any operation on them you need a concrete type). So they introduced rules that some ambiguous type variables should be defaulted to a concrete type if that allows compilation to continue.5
So what is actually happening when you do x == y is that the compiler is just picking Integer to use for x and y in that particular expression, because you haven't given it enough information to pin down any particular type (and because the defaulting rules apply in this situation). Integer has an Eq instance so it can use that, even though the most general types of x and y don't include the Eq constraint. Without picking something it couldn't possibly even attempt to call == (and of course the "something" it picks has to be in Eq or it still won't work).
If you turn on -Wtype-defaults (which is included in -Wall), the compiler will print a warning whenever it applies defaulting6, which makes the process more visible.
1 The forall p part is implicit in standard Haskell, because all type variables are automatically introduced with forall at the beginning of the type expression in which they appear. You have to turn on extensions to even write the forall manually; either ExplicitForAll just for the ability to write forall, or any one of the many extensions that actually add functionality that makes forall useful to write explicitly.
2 GHCi will probably pick p again for the type variable, rather than q. I'm just using a different one to emphasise that they're different variables.
3 Technically it's not each type that necessarily has a different ==, but each Eq instance. Some of those instances are polymorphic, so they apply to multiple types, but that only really comes up with types that have some structure (like Maybe a, etc). Basic types like Int, Integer, Double, Char, Bool, each have their own instance, and each of those instances has its own code for ==.
4 In the underlying system, a type like forall p. Num p => p is in fact much like a function; one that takes a Num instance for a concrete type as a parameter. To get a concrete value you have to first "apply the function" to a type's Num instance, and only then do you get an actual value that could be printed, compared with other things, etc. In standard Haskell these instance parameters are always invisibly passed around by the compiler; some extensions allow you to manipulate this process a little more directly.
This is the root of what's confusing about why x == y works when x and y are polymorphic variables. If you had to explicitly pass around the type/instance arguments it would be obvious what's going on here, because you would have to manually apply both x and y to something and compare the results.
5 The gist of the default rules is that if the constraints on an ambiguous type variable are:
all built-in classes
at least one of them is a numeric class (Num, Floating, etc)
then GHC will try Integer to see if that type checks and allows all other constraints to be resolved. If that doesn't work it will try Double, and if that doesn't work then it reports an error.
You can set the types it will try with a default declaration (the "default default" being default (Integer, Double)), but you can't customise the conditions under which it will try to default things, so changing the default types is of limited use in my experience.
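For example, a default declaration in a module might look like this (a sketch; note it only affects defaulting inside the module that contains it):
{-# OPTIONS_GHC -Wtype-defaults #-}
module Main where

default (Double)  -- replaces the "default default" of (Integer, Double)

main :: IO ()
main = print (2 + 3)  -- the literals default to Double; prints 5.0 (with a warning)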
GHCi however comes with extended default rules that are a bit more useful in an interpreter (because it has to do type inference line-by-line instead of on the whole module at once). You can turn those on in compiled code with ExtendedDefaultRules extension (or turn them off in GHCi with NoExtendedDefaultRules), but again, neither of those options is particularly useful in my experience. It's annoying that the interpreter and the compiler behave differently, but the fundamental difference between module-at-a-time compilation and line-at-a-time interpretation mean that switching either's default rules to work consistently with the other is even more annoying. (This is also why NoMonomorphismRestriction is in effect by default in the interpreter now; the monomorphism restriction does a decent job at achieving its goals in compiled code but is almost always wrong in interpreter sessions).
6 You can also use a typed hole in combination with the asTypeOf helper to get GHC to tell you what type it's inferring for a sub-expression like this:
λ :t x
x :: Num p => p
λ :t y
y :: Num p => p
λ (x `asTypeOf` _) == y
<interactive>:19:15: error:
    • Found hole: _ :: Integer
    • In the second argument of ‘asTypeOf’, namely ‘_’
      In the first argument of ‘(==)’, namely ‘(x `asTypeOf` _)’
      In the expression: (x `asTypeOf` _) == y
    • Relevant bindings include
        it :: Bool (bound at <interactive>:19:1)
      Valid hole fits include
        x :: forall p. Num p => p
          with x
          (defined at <interactive>:1:1)
        it :: forall p. Num p => p
          with it
          (defined at <interactive>:10:1)
        y :: forall p. Num p => p
          with y
          (defined at <interactive>:12:1)
You can see it tells us nice and simply Found hole: _ :: Integer, before proceeding with all the extra information it likes to give us about errors.
A typed hole (in its simplest form) just means writing _ in place of an expression. The compiler errors out on such an expression, but it tries to give you information about what you could use to "fill in the blank" in order to get it to compile; most helpfully, it tells you the type of something that would be valid in that position.
foo `asTypeOf` bar is an old pattern for adding a bit of type information. It returns foo but it restricts (this particular usage of) it to be the same type as bar (the actual value of bar is totally unused). So if you already have a variable d with type Double, x `asTypeOf` d will be the value of x as a Double.
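For reference, asTypeOf lives in the Prelude and is nothing more than a type-restricted const:
-- Returns its first argument, but forces both arguments to the same type.
asTypeOf :: a -> a -> a
asTypeOf x _ = x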
Here I'm using asTypeOf "backwards"; instead of using the thing on the right to constrain the type of the thing on the left, I'm putting a hole on the right (which could have any type), but asTypeOf conveniently makes sure it's the same type as x without otherwise changing how x is used in the overall expression (so the same type inference still applies, including defaulting, which isn't always the case if you lift a small part of a larger expression out to ask GHCi for its type with :t; in particular :t x won't tell us Integer, but Num p => p).

Why the pointfree style does not cause a problem?

I read about The Monomorphism Restriction from the page https://www.haskell.org/tutorial/pitfalls.html and could not understand the last point:
A common violation of the restriction happens with functions defined
in a higher-order manner, as in this definition of sum from the
Standard Prelude:
sum = foldl (+) 0
As is, this would cause a static type error. We can fix the problem by
adding the type signature:
sum :: (Num a) => [a] -> a
Also note that this problem would not have arisen if we had written:
sum xs = foldl (+) 0 xs
because the restriction only applies to pattern bindings.
Why does the last point not cause any error?
because the restriction only applies to pattern bindings.
Essentially, the MR does not apply when we are defining a function using a function binding of the form
f arg1 ... argN = ...
with N > 0.
The intuition is as follows. The purpose of the MR is to avoid turning Haskell non-functions into lower-level functions accidentally. For instance,
x = 3 + 4
is not a function. However, its type is Num a => a, which is usually implemented as a function from a Num dictionary to the result of 3+4, where + is a function defined by the dictionary. This can lead to bad performance, since every time we use x the sum will need to be recomputed from scratch. This is unavoidable if we want to compute print (x :: Int) >> print (x :: Double), for instance. But actually using x at different types is rather uncommon.
So, the MR makes x monomorphic, preventing us from using it at more than a single type. In that way, recomputation can be avoided.
However, if x is already a function there is no harm in keeping that polymorphic, since we are "recomputing" function calls anyway. So, the MR does not apply to function bindings.
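To observe the recomputation directly, here is a small sketch using trace (with optimizations GHC may share more, but each type instantiation is still computed separately):
{-# LANGUAGE NoMonomorphismRestriction #-}
import Debug.Trace (trace)

-- Without the MR, x stays polymorphic: x :: Num a => a
x = trace "computing x" (3 + 4)

main :: IO ()
main = do
  print (x :: Int)     -- prints "computing x", then 7
  print (x :: Double)  -- prints "computing x" again, then 7.0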

Type inference interferes with referential transparency

What is the precise promise/guarantee the Haskell language provides with respect to referential transparency? At least the Haskell report does not mention this notion.
Consider the expression
(7^7^7`mod`5`mod`2)
And I want to know whether or not this expression is 1. To be safe, I will perform this check twice:
( (7^7^7`mod`5`mod`2)==1, [False,True]!!(7^7^7`mod`5`mod`2) )
which now gives (True,False) with GHCi 7.4.1.
Evidently, this expression is now referentially opaque. How can I tell whether or not a program is subject to such behavior? I can inundate the program with :: all over but that does not make it very readable. Is there any other class of Haskell programs in between that I miss? That is between a fully annotated and an unannotated one?
(Apart from the only somewhat related question I found on SO there must be something else on this)
I do not think there's any guarantee that evaluating a polymorphically typed expression such as 5 at different types will produce "compatible" results, for any reasonable definition of "compatible".
GHCi session:
> class C a where num :: a
> instance C Int where num = 0
> instance C Double where num = 1
> num + length [] -- length returns an Int
0
> num + 0 -- GHCi defaults to Double for some reason
1.0
This looks as it's breaking referential transparency since length [] and 0 should be equal, but under the hood it's num that's being used at different types.
Also,
> "" == []
True
> [] == [1]
False
> "" == [1]
*** Type error
where one could have expected False in the last line.
So, I think referential transparency only holds when the exact types are specified to resolve polymorphism. An explicit type parameter application à la System F would make it possible to always substitute a variable with its definition without altering the semantics: as far as I understand, GHC internally does exactly this during optimization to ensure that semantics is unaffected. Indeed, GHC Core has explicit type arguments which are passed around.
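In modern GHC (much newer than the 7.4.1 used above) the TypeApplications extension lets you pass those type arguments explicitly, which makes the instantiation visible at the source level; on a 64-bit machine, for example:
λ> :set -XTypeApplications
λ> (==) @Integer (7^7^7`mod`5`mod`2) 1
True
λ> (==) @Int (7^7^7`mod`5`mod`2) 1
False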
The problem is overloading, which does indeed sort of violate referential transparency. You have no idea what something like (+) does in Haskell; it depends on the type.
When a numeric type is unconstrained in a Haskell program the compiler uses type defaulting to pick some suitable type. This is for convenience, and usually doesn't lead to any surprises. But in this case it did lead to a surprise. In ghc you can use -fwarn-type-defaults to see when the compiler has used defaulting to pick a type for you. You can also add the line default () to your module to stop all defaulting.
I thought of something which might help clarify things...
The expression mod (7^7^7) 5 has type Integral a => a, so there are two common ways to convert it to an Int:
Perform all of the arithmetic using Integer operations and types and then convert the result to an Int.
Perform all of the arithmetic using Int operations.
If the expression is used in an Int context Haskell will perform method #2. If you want to force Haskell to use #1 you have to write:
fromInteger (mod (7^7^7) 5)
This will ensure that all of the arithmetic operations will be performed using Integer operations and types.
When you enter the expression at the ghci REPL, defaulting rules typed the expression as an Integer, so method #1 was used. When you used the expression with the !! operator, it was typed as an Int, so it was computed via method #2.
My original answer:
In Haskell the evaluation of an expression like
(7^7^7`mod`5`mod`2)
depends entirely on which Integral instance is being used, and this is something that every Haskell programmer learns to accept.
The second thing that every programmer (in any language) has to be aware of is that numeric operations are subject to overflow, underflow, loss of precision, and so on, and thereby the laws of arithmetic may not always hold. For instance, x+1 > x is not always true; addition and multiplication of floating-point numbers are not always associative; the distributive law does not always hold; etc. When you create an overflowing expression you enter the realm of undefined behavior.
Also, in this particular case there are better ways to go about evaluating this expression which preserves more of our expectation of what the result should be. In particular, if you want to efficiently and accurately compute a^b mod c you should be using the "power mod" algorithm.
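A sketch of such a square-and-multiply implementation (powMod is a name chosen here for illustration; intermediate values never grow beyond roughly c^2):
-- Compute a^b `mod` c without ever building the huge intermediate power.
powMod :: Integer -> Integer -> Integer -> Integer
powMod a b c = go (a `mod` c) b 1
  where
    go _    0 acc = acc
    go base e acc
      | odd e     = go base' (e `div` 2) (acc * base `mod` c)
      | otherwise = go base' (e `div` 2) acc
      where base' = base * base `mod` c
For instance, powMod 7 (7^7) 5 evaluates to 3, matching the "as an Integer" line in the output below.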
Update: Run the following program to see how the choice of Integral instance affects the what an expression evaluates to:
import Data.Int
import Data.Word
import Data.LargeWord -- cabal install largeword
expr :: Integral a => a
expr = (7^e `mod` 5)
  where e = 823543 :: Int

main :: IO ()
main = do
  putStrLn $ "as an Integer: " ++ show (expr :: Integer)
  putStrLn $ "as an Int64: " ++ show (expr :: Int64)
  putStrLn $ "as an Int: " ++ show (expr :: Int)
  putStrLn $ "as an Int32: " ++ show (expr :: Int32)
  putStrLn $ "as an Int16: " ++ show (expr :: Int16)
  putStrLn $ "as a Word8: " ++ show (expr :: Word8)
  putStrLn $ "as a Word16: " ++ show (expr :: Word16)
  putStrLn $ "as a Word32: " ++ show (expr :: Word32)
  putStrLn $ "as a Word128: " ++ show (expr :: Word128)
  putStrLn $ "as a Word192: " ++ show (expr :: Word192)
  putStrLn $ "as a Word224: " ++ show (expr :: Word224)
  putStrLn $ "as a Word256: " ++ show (expr :: Word256)
and the output (compiled with GHC 7.8.3, 64-bit):
as an Integer: 3
as an Int64: 2
as an Int: 2
as an Int32: 3
as an Int16: 3
as a Word8: 4
as a Word16: 3
as a Word32: 3
as a Word128: 4
as a Word192: 0
as a Word224: 2
as a Word256: 1
What is the precise promise/guarantee the Haskell language provides with respect to referential transparency? At least the Haskell report does not mention this notion.
Haskell does not provide a precise promise or guarantee. There exist many functions like unsafePerformIO or traceShow which are not referentially transparent. The extension called Safe Haskell however provides the following promise:
Referential transparency — Functions in the safe language are deterministic, evaluating them will not cause any side effects. Functions in the IO monad are still allowed and behave as usual. Any pure function though, as according to its type, is guaranteed to indeed be pure. This property allows a user of the safe language to trust the types. This means, for example, that the unsafePerformIO :: IO a -> a function is disallowed in the safe language.
Haskell provides an informal promise outside of this: the Prelude and base libraries tend to be free of side effects and Haskell programmers tend to label things with side effects as such.
Evidently, this expression is now referentially opaque. How can I tell whether or not a program is subject to such behavior? I can inundate the program with :: all over but that does not make it very readable. Is there any other class of Haskell programs in between that I miss? That is between a fully annotated and an unannotated one?
As others have said, the problem emerges from this behavior:
Prelude> ( (7^7^7`mod`5`mod`2)==1, [False,True]!!(7^7^7`mod`5`mod`2) )
(True,False)
Prelude> 7^7^7`mod`5`mod`2 :: Integer
1
Prelude> 7^7^7`mod`5`mod`2 :: Int
0
This happens because 7^7^7 is a huge number (about 700,000 decimal digits) which easily overflows a 64-bit Int type, but the problem will not be reproducible on 32-bit systems:
Prelude> :m + Data.Int
Prelude Data.Int> 7^7^7 :: Int64
-3568518334133427593
Prelude Data.Int> 7^7^7 :: Int32
1602364023
Prelude Data.Int> 7^7^7 :: Int16
8823
If using rem (7^7^7) 5 the remainder for Int64 will be reported as -3 but since -3 is equivalent to +2 modulo 5, mod reports +2.
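You can see the difference directly (assuming the Int64 value quoted above):
Prelude Data.Int> (7^7^7 :: Int64) `rem` 5
-3
Prelude Data.Int> (7^7^7 :: Int64) `mod` 5
2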
The Integer answer is used on the left due to the defaulting rules for Integral classes; the platform-specific Int type is used on the right due to the type of (!!) :: [a] -> Int -> a. If you use the appropriate indexing operator for Integral a you instead get something consistent:
Prelude> :m + Data.List
Prelude Data.List> ((7^7^7`mod`5`mod`2) == 1, genericIndex [False,True] (7^7^7`mod`5`mod`2))
(True,True)
The problem here is not referential transparency, because the function we're calling, ^, is actually two different functions (as they have different types). What has tripped you up is typeclasses, which are an implementation of constrained ambiguity in Haskell; you have discovered that this ambiguity (unlike unconstrained ambiguity, i.e. parametric types) can deliver counterintuitive results. This shouldn't be too surprising, but it's definitely a little strange at times.
Another type has been chosen, because !! requires an Int. The full computation now uses Int instead of Integer.
λ> ( (7^7^7`mod`5`mod`2 :: Int)==1, [False,True]!!(7^7^7`mod`5`mod`2) )
(False,False)
Why do you think this has anything to do with referential transparency? Your uses of 7, ^, mod, 5, 2, and == are applications of those variables to dictionaries, yes, but I don't see why that fact would make Haskell referentially opaque. Often applying the same function to different arguments produces different results, after all!
Referential transparency has to do with this expression:
let x :: Int = 7^7^7`mod`5`mod`2 in (x == 1, [False, True] !! x)
x is here a single value, and should always have that same single value.
By contrast, if you say:
let x :: forall a. Num a => a; x = 7^7^7`mod`5`mod`2 in (x == 1, [False, True] !! x)
(or use the expression inline, which is equivalent), x is now a function, and can return different values depending on the Num argument you supply to it. You might as well complain that let f = (+1) in map f [1, 2, 3] is [2, 3, 4], but let f = (+3) in map f [1, 2, 3] is [4, 5, 6] and then say "Haskell gives different values for map f [1, 2, 3] depending on the context so it's referentially opaque"!
Probably another type-inference and referential-transparency related thing is the "dreaded" Monomorphism Restriction (its absence, to be exact). A direct quote:
An example, from "A History of Haskell":
Consider the genericLength function, from Data.List
genericLength :: Num a => [b] -> a
And consider the function:
f xs = (len, len)
  where
    len = genericLength xs
len has type Num a => a and, without the monomorphism restriction, it could be computed twice.
Notice that in this case types of both expressions are the same. Results are too, but the substitution isn't always possible.
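A self-contained version of that example (the two-type signature is mine, added to force len to be used at two different types; it needs NoMonomorphismRestriction to compile):
{-# LANGUAGE NoMonomorphismRestriction #-}
import Data.List (genericLength)

f :: (Num a, Num b) => [x] -> (a, b)
f xs = (len, len)
  where len = genericLength xs  -- polymorphic: computed once per type

main :: IO ()
main = print (f "abc" :: (Int, Double))  -- (3,3.0)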

Haskell type designation

I have to designate the types of 2 functions (without using the compiler's :t); I just don't know how I should read these functions to make the correct steps.
f x = map -1 x
f x = map (-1) x
Well, I'm a bit confused about how it will be parsed.
Function application, or "the empty space operator", has higher precedence than any operator symbol, so the first line parses as f x = map - (1 x), which will most likely1 be a type error.
The other example is parenthesized the way it looks, but note that (-1) desugars as negate 1. This is an exception from the normal rule, where operator sections like (+1) desugar as (\x -> x + 1), so this will also likely1 be a type error since map expects a function, not a number, as its first argument.
1 I say likely because it is technically possible to provide Num instances for functions which may allow this to type check.
For questions like this, the definitive answer is to check the Haskell Report. The relevant syntax hasn't changed from Haskell 98.
In particular, check the section on "Expressions". That should explain how expressions are parsed, operator precedence, and the like.
These functions do not have types, because they do not type check (you will get ridiculous type class constraints). To figure out why, you need to know that (-1) has type Num n => n, and you need to read up on how a - is interpreted with or without parens before it.
The following function is the "correct" version of your function:
f x = map (subtract 1) x
You should be able to figure out the type of this function, if I say that:
subtract 1 :: Num n => n -> n
map :: (a -> b) -> [a] -> [b]
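Unifying those two types (the a and b of map both become the n of subtract 1), you should arrive at:
f :: Num n => [n] -> [n]
f x = map (subtract 1) x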
Well, I did it by myself :P
(map) - (1 x)
(-) :: Num a => a -> a -> a
1 :: Num b => b
x :: e
map :: (c -> d) -> [c] -> [d]
map :: a
a ~ (c -> d) -> [c] -> [d]
(1 x) :: a
1 :: e -> a
f :: (Num ((c -> d) -> [c] -> [d]), Num (e -> (c -> d) -> [c] -> [d])) => e -> (c -> d) -> [c] -> [d]

Evaluation strategy

How should one reason about function evaluation in examples like the following in Haskell:
let f x = ...
    x = ...
in map (g (f x)) xs
In GHC, sometimes (f x) is evaluated only once, and sometimes once for each element in xs, depending on what exactly f and g are. This can be important when f x is an expensive computation. It has just tripped up a Haskell beginner I was helping, and I didn't know what to tell him other than that it is up to the compiler. Is there a better story?
Update
In the following example (f x) will be evaluated 4 times:
let f x = trace "!" $ zip x x
    x = "abc"
in map (\i -> lookup i (f x)) "abcd"
With language extensions, we can create situations where f x must be evaluated repeatedly:
{-# LANGUAGE GADTs, Rank2Types #-}
module MultiEvG where
data BI where
  B :: (Bounded b, Integral b) => b -> BI

foo :: [BI] -> [Integer]
foo xs = let f :: (Integral c, Bounded c) => c -> c
             f x = maxBound - x
             g :: (forall a. (Integral a, Bounded a) => a) -> BI -> Integer
             g m (B y) = toInteger (m + y)
             x :: (Integral i) => i
             x = 3
         in map (g (f x)) xs
The crux is to have f x polymorphic even as the argument of g, and we must create a situation where the type(s) at which it is needed can't be predicted (my first stab used an Either a b instead of BI, but when optimising, that of course led to only two evaluations of f x at most).
A polymorphic expression must be evaluated at least once for each type it is used at. That's one reason for the monomorphism restriction. However, when the range of types it can be needed at is restricted, it is possible to memoise the values at each type, and in some circumstances GHC does that (needs optimising, and I expect the number of types involved mustn't be too large). Here we confront it with what is basically an inhomogeneous list, so in each invocation of g (f x), it can be needed at an arbitrary type satisfying the constraints, so the computation cannot be lifted outside the map (technically, the compiler could still build a cache of the values at each used type, so it would be evaluated only once per type, but GHC doesn't, in all likelihood it wouldn't be worth the trouble).
Monomorphic expressions need only be evaluated once, they can be shared. Whether they are is up to the implementation; by purity, it doesn't change the semantics of the programme. If the expression is bound to a name, in practice you can rely on it being shared, since it's easy and obviously what the programmer wants. If it isn't bound to a name, it's a question of optimisation. With the bytecode generator or without optimisations, the expression will often be evaluated repeatedly, but with optimisations repeated evaluation would indicate a compiler bug.
Polymorphic expressions must be evaluated at least once for every type they're used at, but with optimisations, when GHC can see that it may be used multiple times at the same type, it will (usually) still be shared for that type during a larger computation.
Bottom line: Always compile with optimisations, help the compiler by binding expressions you want shared to a name, and give monomorphic type signatures where possible.
Your examples are indeed quite different.
In the first example, the argument to map is g (f x); it is passed once to map, most likely as a partially applied function.
Should g (f x), when applied to an argument within map, evaluate its first argument, then this will be done only once, and the thunk (f x) will be updated with the result.
Hence, in your first example, f x will be evaluated at most once.
Your second example requires a deeper analysis before the compiler can arrive at the conclusion that (f x) is always constant in the lambda expression. Perhaps it will never optimize it at all, because it may have knowledge that trace is not quite kosher. So, this may evaluate 4 times when tracing, and 4 times or 1 time when not tracing.
This is really dependent on GHC's optimizations, as you've been able to tell.
The best thing to do is to study the GHC core that you get after optimizing the program. I would look at the generated Core and examine whether f x had its own let statement outside the map or not.
If you want to be sure, then you should factor f x out into its own variable assigned in a let, but there's not really a guaranteed way to figure it out other than reading through Core.
All that said, with the exception of things like trace that use unsafePerformIO, this will never change the semantics of your program: how it actually behaves.
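Concretely, for the traced example above, binding f x to a name makes the sharing explicit (a sketch; assumes Debug.Trace is imported):
let f x = trace "!" $ zip x x
    x = "abc"
    fx = f x                      -- bound to a name, so the thunk is forced once
in map (\i -> lookup i fx) "abcd"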
In GHC without optimizations, the body of a function is evaluated every time the function is called. (A "call" means the function is applied to arguments and the result is evaluated.) In the following example, f x is inside a function, so it will execute each time the function is called.
(GHC may optimize this expression as discussed in the FAQ [1].)
let f x = trace "!" $ zip x x
    x = "abc"
in map (\i -> lookup i (f x)) "abcd"
However, if we move f x out of the function, it will execute only once.
let f x = trace "!" $ zip x x
    x = "abc"
in map ((\f_x i -> lookup i f_x) (f x)) "abcd"
This can be rewritten more readably as
let f x = trace "!" $ zip x x
    x = "abc"
    g f_x i = lookup i f_x
in map (g (f x)) "abcd"
The general rule is that, each time a function is applied to an argument, a new "copy" of the function body is created. Function application is the only thing that may cause an expression to re-execute. However, be warned that some functions and function calls do not look like functions syntactically.
[1] http://www.haskell.org/haskellwiki/GHC/FAQ#Subexpression_Elimination