Python-"is"-like equality operator for Haskell/GHC - haskell

Is there a GHC-specific "unsafe" extension to ask whether two Haskell references point to the same location?
I'm aware this can break referential transparency if not used properly. But there should be little harm (unless I'm missing something), if it is used very careful, as a means for optimizations by short-cutting recursive (or expensive) data traversal, e.g. for implementing an optimized Eq instance, e.g.:
instance Eq ComplexTree where
a == b = (a `unsafeSameRef` b) || (a `deepCompare` b)
providing deepCompare is guaranteed to be true if unsafeSameRef decides true (but not necessarily the other way around).
EDIT/PS: Thanks to the answer pointing to System.Mem.StableName, I was able to also find the paper Stretching the storage manager: weak pointers and stable names in Haskell which happens to have addressed this very problem already over 10 years ago...

GHC's System.Mem.StableName solves exactly this problem.

There's a pitfall to be aware of:
Pointer equality can change strictness. I.e., you might get pointer equality saying True when in fact the real equality test would have looped because of, e.g., a circular structure. So pointer equality ruins the semantics (but you knew that).

I think StablePointers might be of help here
http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/Foreign-StablePtr.html
Perhaps this is the kind of solution you are looking for:
import Foreign.StablePtr (newStablePtr, freeStablePtr)
import System.IO.Unsafe (unsafePerformIO)
unsafeSameRef :: a -> a -> Bool
unsafeSameRef x y = unsafePerformIO $ do
a <- newStablePtr x
b <- newStablePtr y
let z = a == b
freeStablePtr a
freeStablePtr b
return z;

There's unpackClosure# in GHC.Prim, with the following type:
unpackClosure# :: a -> (# Addr#,Array# b,ByteArray# #)
Using that you could whip up something like:
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Prim
eq a b = case unpackClosure# a of
(# a1,a2,a3 #) -> case unpackClosure# b of
(# b1,b2,b3 #) -> eqAddr# a1 b1
And in the same package, there's the interestingly named reallyUnsafePtrEquality# of type
reallyUnsafePtrEquality# :: a -> a -> Int#
But I'm not sure what the return value of that one is - going by the name it will lead to much gnashing of teeth.

Related

Writing modules in Haskell the right way

(I'm totally rewriting this question to give it a better focus; you can see the history of changes if you want to see the original.)
Let's say I have two modules:
One module defines the function inverseAndSqrt. What this function actually does is not important; what is important is that it returns none, one, or both of two things in a way that the client can distinguish which one is which;
module Module1 (inverseAndSqrt) where
type TwoOpts a = (Maybe a, Maybe a)
inverseAndSqrt :: Int -> TwoOpts Float
inverseAndSqrt x = (if x /= 0 then Just (1.0/(fromIntegral x)) else Nothing,
if x >= 0 then Just (sqrt $ fromIntegral x) else Nothing)
another module defines other functions depending on inverseAndSqrt and on its type
module Module2 where
import Module1
fun :: (Maybe Float, Maybe Float) -> Float
fun (Just x, Just y) = x + y
fun (Just x, Nothing) = x
fun (Nothing, Just y) = y
exportedFun :: Int -> Float
exportedFun = fun . inverseAndSqrt
What I want to understand from the perspective of design principle is: how should I interface Module1 with other modules (e.g. Module2) in a way that makes it well encapsulated, reusable, etc?
The problems I see are
I could one day decide that I don't want to use a pair to return the two results anymore; I could decide to use a 2 elements list; or another type which is isomorphic (I think this is the right adjective, isn't it?) to a pair; if I do this, all client code will break
Exporting the TwoOpts type synonym doesn't solve anything, as Module1 could still change its implementation thus breaking client code.
Module1 is also forcing the type of the two optionals to be the same, but I'm not sure this is really relevant to this question...
How should I design Module1 (and thus edit Module2 as well) such that the two are not tightly coupled?
One thing I can think of is that maybe I should define a typeclass expressing what "a box with two optional things in it" is, and then Module1 and Module2 would use that as a common interface. But should that be in both module? In either of them? Or in none of them, in a third module? Or maybe such a class/concept is not needed?
I'm not a computer scientist so I'm sure that this question highlights some misunderstanding of mine due to lack of experience and theoretical background. Any help filling the gaps is welcome.
Possible modifications I'd like to support
Related to what chepner suggested in a comment to his answer, at some point I might want to extend the support from 2-tuple things to both 2- and 3-tuple things, having different accessor names for them, suche as get1of2/get2of2 (let's say these are the name we use when we first design Module1) vs get1of3/get2of3/get3of3.
At some point I would also be able to complement this 2-tuple-like type with something else, for instance an optional containing Just the sum¹ of the two main contents only if they are both Justs, or a Nothing if at least one of the two main contents is a Nothing. I guess in this case the internal representation of this class would be something like ((Maybe a, Maybe a), Maybe b) (¹ The sum is really a stupid example, so I've used b here instead of a to be more general than the sum would require).
To me, Haskell design is all type-centric. The design rule for functions is just "use the most general and accurate types that do the job", and the whole problem of design in Haskell is about coming up with the best types for the job.
We would like there to be no "junk" in the types, so that they have exactly one representation for each value you want to denote. E.g. String is a bad representation for numbers, because "0", "0.0", "-0" all mean the same thing, and also because "The Prisoner" is not a number -- it is a valid representation that does not have a valid denotation. If, say for performance reasons, the same denotation can be represented multiple ways, the type's API should make that difference invisible to the user.
So in your case, (Maybe a, Maybe a) is perfect -- it means exactly what you need it to mean. Using something more complicated is unnecessary, and will just complicate matters for the user. At some point whatever you expose will have to be convertible to a Maybe a for the first thing and a Maybe a for the second thing, and there is no extra information than that, so the tuple is perfect. Whether you use a type synonym or not is a matter of style -- I prefer not use synonyms at all and only give types names when I have a more formal abstraction in mind.
Connotation is important. For example, if I had a function for finding the roots of a quadratic polynomial, I probably wouldn't use TwoOpts, even though there are at most two of them. The fact that my return values are all "the same kind of thing" in an intuitive sense makes me prefer a list (or if I'm feeling particularly picky, a Set or Bag), even if the list has at most two elements. I just have it match my best understanding of the domain at the time, so I won't change it unless my understanding of the domain has changed in a significant way, in which case the opportunity to review all its uses is exactly what I want. If you are writing your functions to be as polymorphic as possible, then often you won't need to change anything but the specific moments the meaning is used, the exact moment domain knowledge is required (such as understanding the relationship between TwoOpts and Set). You don't need to "redo the plumbing" if it's made of a sufficiently flexible, polymorphic material.
Supposing you didn't have a clean isomorphism to a standard type like (Maybe a, Maybe a), and you wanted to formalize TwoOpts. The way here is to build an API out of its constructors, combinators, and eliminators. For example:
data TwoOpts a -- abstract, not exposed
-- constructors
none :: TwoOpts a
justLeft :: a -> TwoOpts a
justRight :: a -> TwoOpts a
both :: a -> a -> TwoOpts a
-- combinators
-- Semigroup and Monoid at least
swap :: TwoOpts a -> TwoOpts a
-- eliminators
getLeft :: TwoOpts a -> Maybe a
getRight :: TwoOpts a -> Maybe a
In this case the eliminators give exactly your representation (Maybe a, Maybe a) as their final coalgebra.
-- same as the tuple in a newtype, just more conventional
data TwoOpts a = TwoOpts (Maybe a) (Maybe a)
Or if you wanted to focus on the constructors side you could use an initial algebra
data TwoOpts a
= None
| JustLeft a
| JustRight a
| Both a a
You are at liberty to change this representation as long as it still implements the combinatory API above. If you have reason to use different representations of the same API, make the API into a typeclass (typeclass design is a whole other story).
In Einstein's famous words, "make it as simple as possible, but no simpler".
Don't define a simple type alias; this exposes the details of how you implement TwoOpts.
Instead, define a new type, but don't export the data constructor, but rather functions for accessing the two components. Then you are free to change the implementation of the type all you like without changing the interface, because the user can't pattern-match on a value of type TwoOpts a.
module Module1 (TwoOpts, inverseAndSqrt, getFirstOpt, getSecondOpt) where
data TwoOpts a = TwoOpts (Maybe a) (Maybe a)
getFirstOpt, getSecondOpt :: TwoOpts a -> Maybe a
getFirstOpt (TwoOpts a _) = a
getSecondOpt (TwoOpts _ b) = b
inverseAndSqrt :: Int -> TwoOpts Float
inverseAndSqrt x = TwoOpts (safeInverse x) (safeSqrt x)
where safeInverse 0 = Nothing
safeInverse x = Just (1.0 / fromIntegral x)
safeSqrt x | x >= 0 = Just $ sqrt $ fromIntegral x
| otherwise = Nothing
and
module Module2 where
import Module1
fun :: TwoOpts Float -> Float
fun a = case (getFirstOpts a, getSecondOpt a) of
(Just x, Just y) -> x + y
(Just x, Nothing) -> x
(Nothing, Just y) -> y
exportedFun :: Int -> Float
exportedFun = fun . inverseAndSqrt
Later, when you realize that you've reimplemented the type product, you can change your definitions without affecting any user code.
newtype TwoOpts a = TwoOpts { getOpts :: (Maybe a, Maybe a) }
getFirstOpt, getSecondOpt :: TwoOpts a -> Maybe a
getFirstOpt = fst . getOpts
getSecondOpt = snd . getOpts

Can Haskell inline functions passed as an argument?

Let's say I pass a small function f to map. Can Haskell inline f with map to produce a small imperative loop? If so, how does Haskell keep track of what function f really is? Can the same be done with Arrow combinators?
If f is passed in as an argument, then no, probably not. If f is the name of a top-level function or a local function, then probably yes.
foobar f = ... map f ...
-- Probably not inlined.
foobar = ... map (\ x -> ...) ...
-- Probably inlined.
That said, I gather that most of the performance difference between inline and out of line comes not from the actual inlining itself, but rather from any additional subsequent optimisations this might allow.
The only way to be "sure" about these things is to actually write the code, actually compile it, and have a look at the Core that gets generated. And the only way to know if it makes a difference (positive or negative) is to actually benchmark the thing.
The definition of the Haskell language does not mandate a Haskell implementation to inline code, or to perform any kind of optimization. Any implementation is free to apply any optimization it may deem appropriate.
That being said, Haskell is nowadays often run using GHC, which does optimize Haskell code. For inlining, GHC uses some heuristics to decide whether something should inlined or not. The general advice is to turn optimization on with -O2 and check the output of the compiler. You can see the produced Core with -ddump-simpl, or the assembly code with -ddump-asm. Some other flags can be useful as well.
If you then see that GHC is not inlining stuff you would like to, you can provide a hint to the compiler with {-# INLINE foo #-} and related pragmas.
Be wary of mindlessly applying optimizations, though. Often, programmers spend their time to optimize parts of the program which have a negligible impact to the overall performance. To avoid this, it is strongly recommended to profile your code first, so that you know where your program spends a lot of time.
Here is an example where GHC does inline a function passed as an argument :
import qualified Data.Vector.Unboxed as U
import qualified Data.Vector as V
plus :: Int -> Int -> Int
plus = (+)
sumVect :: V.Vector Int -> Int
sumVect = V.foldl1 plus
plus is passed as the argument of foldl1, which results in summing a vector of integers. In the Core, plus is inlined and optimized to the unboxed GHC.Prim.+# :: Int# -> Int# -> Int# :
letrec {
$s$wfoldlM_loop_s759
:: GHC.Prim.Int# -> GHC.Prim.Int# -> GHC.Prim.Int#
$s$wfoldlM_loop_s759 =
\ (sc_s758 :: GHC.Prim.Int#) (sc1_s757 :: GHC.Prim.Int#) ->
case GHC.Prim.tagToEnum# # Bool (GHC.Prim.>=# sc_s758 ww1_s748)
of _ {
False ->
case GHC.Prim.indexArray#
# Int ww2_s749 (GHC.Prim.+# ww_s747 sc_s758)
of _ { (# ipv1_X72o #) ->
case ipv1_X72o of _ { GHC.Types.I# y_a5Kg ->
$s$wfoldlM_loop_s759
(GHC.Prim.+# sc_s758 1#) (GHC.Prim.+# sc1_s757 y_a5Kg)
}
};
True -> sc1_s757
}; }
That's the GHC.Prim.+# sc1_s757 y_a5Kg. You can add simple artihmetic inside function plus and see this Core expression expand.

Use of 'unsafeCoerce'

In Haskell, there is a function called unsafeCoerce, that turns anything into any other type of thing. What exactly is this used for? Like, why we would you want to transform things into each other in such an "unsafe" way?
Provide an example of a way that unsafeCoerce is actually used. A link to Hackage would help. Example code in someones question would not.
unsafeCoerce lets you convince the type system of whatever property you like. It's thus only "safe" exactly when you can be completely certain that the property you're declaring is true. So, for instance:
unsafeCoerce True :: Int
is a violation and can lead to wonky, bad runtime behavior.
unsafeCoerce (3 :: Int) :: Int
is (obviously) fine and will not lead to runtime misbehavior.
So what's a non-trivial use of unsafeCoerce? Let's say we've got an typeclass-bound existential type
module MyClass ( SomethingMyClass (..), intSomething ) where
class MyClass x where {}
instance MyClass Int where {}
data SomethingMyClass = forall a. MyClass a => SomethingMyClass a
Let's also say, as noted here, that the typeclass MyClass is not exported and thus nobody else can ever create instances of it. Indeed, Int is the only thing that instantiates it and the only thing that ever will.
Now when we pattern match to destruct a value of SomethingMyClass we'll be able to pull a "something" out from inside
foo :: SomethingMyClass -> ...
foo (SomethingMyClass a) =
-- here we have a value `a` with type `exists a . MyClass a => a`
--
-- this is totally useless since `MyClass` doesn't even have any
-- methods for us to use!
...
Now, at this point, as the comment suggests, the value we've pulled out has no type information—it's been "forgotten" by the existential context. It could be absolutely anything which instantiates MyClass.
Of course, in this very particular situation we know that the only thing implementing MyClass is Int. So our value a must actually have type Int. We could never convince the typechecker that this is true, but due to an outside proof we know that it is.
Therefore, we can (very carefully)
intSomething :: SomethingMyClass -> Int
intSomething (SomethingMyClass a) = unsafeCoerce a -- shudder!
Now, hopefully I've suggested that this is a terrible, dangerous idea, but it also may give a taste of what kind of information we can take advantage of in order to know things that the typechecker cannot.
In non-pathological situations, this is rare. Even rarer is a situation where using something we know and the typechecker doesn't isn't itself pathological. In the above example, we must be completely certain that nobody ever extends our MyClass module to instantiate more types to MyClass otherwise our use of unsafeCoerce becomes instantly unsafe.
> instance MyClass Bool where {}
> intSomething (SomethingMyClass True)
6917529027658597398
Looks like our compiler internals are leaking!
A more common example where this sort of behavior might be valuable is when using newtype wrappers. It's a fairly common idea that we might wrap a type in a newtype wrapper in order to specialize its instance definitions.
For example, Int does not have a Monoid definition because there are two natural monoids over Ints: sums and products. Instead, we use newtype wrappers to be more explicit.
newtype Sum a = Sum { getSum :: a }
instance Num a => Monoid (Sum a) where
mempty = Sum 0
mappend (Sum a) (Sum b) = Sum (a+b)
Now, normally the compiler is pretty smart and recognizes that it can eliminate all of those Sum constructors in order to produce more efficient code. Sadly, there are times when it cannot, especially in highly polymorphic situations.
If you (a) know that some type a is actually just a newtype-wrapped b and (b) know that the compiler is incapable of deducing this itself, then you might want to do
unsafeCoerce (x :: a) :: b
for a slight efficiency gain. This, for instance, occurs frequently in lens and is expressed in the Data.Profunctor.Unsafe module of profunctors, a dependency of lens.
But let me again suggest that you really need to know what's going on before using unsafeCoerce like this is anything but highly unsafe.
One final thing to compare is the "typesafe cast" available in Data.Typeable. This function looks a bit like unsafeCoerce, but with much more ceremony.
unsafeCoerce :: a -> b
cast :: (Typeable a, Typeable b) => a -> Maybe b
Which, you might think of as being implemented using unsafeCoerce and a function typeOf :: Typeable a => a -> TypeRep where TypeRep are unforgeable, runtime tokens which reflect the type of a value. Then we have
cast :: (Typeable a, Typeable b) => a -> Maybe b
cast a = if (typeOf a == typeOf b) then Just b else Nothing
where b = unsafeCoerce a
Thus, cast is able to ensure that the types of a and b really are the same at runtime, and it can decide to return Nothing if they are not. As an example:
{-# LANGUAGE DeriveDataTypeable #-}
{-# LANGUAGE ExistentialQuantification #-}
data A = A deriving (Show, Typeable)
data B = B deriving (Show, Typeable)
data Forget = forall a . Typeable a => Forget a
getAnA :: Forget -> Maybe A
getAnA (Forget something) = cast something
which we can run as follows
> getAnA (Forget A)
Just A
> getAnA (Forget B)
Nothing
So if we compare this usage of cast with unsafeCoerce we see that it can achieve some of the same functionality. In particular, it allows us to rediscover information that may have been forgotten by ExistentialQuantification. However, cast manually checks the types at runtime to ensure that they are truly the same and thus cannot be used unsafely. To do this, it demands that both the source and target types allow for runtime reflection of their types via the Typeable class.
The only time I ever felt compelled to use unsafeCoerce was on finite natural numbers.
{-# LANGUAGE DataKinds, GADTs, TypeFamilies, StandaloneDeriving #-}
data Nat = Z | S Nat deriving (Eq, Show)
data Fin (n :: Nat) :: * where
FZ :: Fin (S n)
FS :: Fin n -> Fin (S n)
deriving instance Show (Fin n)
Fin n is a singly linked data structure that is statically ensured to be smaller than the n type level natural number by which it is parametrized.
-- OK, 1 < 2
validFin :: Fin (S (S Z))
validFin = FS FZ
-- type error, 2 < 2 is false
invalidFin :: Fin (S (S Z))
invalidFin = FS (FS FZ)
Fin can be used to safely index into various data structures. It's pretty standard in dependently typed languages, though not in Haskell.
Sometimes we want to convert a value of Fin n to Fin m where m is greater than n.
relaxFin :: Fin n -> Fin (S n)
relaxFin FZ = FZ
relaxFin (FS n) = FS (relaxFin n)
relaxFin is a no-op by definition, but traversing the value is still required for the types to check out. So we might just use unsafeCoerce instead of relaxFin. More pronounced gains in speed can result from coercing larger data structures that contain Fin-s (for example, you could have lambda terms with Fin-s as bound variables).
This is an admittedly exotic example, but I find it interesting in the sense that it's pretty safe: I can't really think of ways for external libraries or safe user code to mess this up. I might be wrong though and I'd be eager to hear about potential safety issues.
There is no use of unsafeCoerce I can really recommend, but I can see that in some cases such a thing might be useful.
The first use that springs to mind is the implementation of the Typeable-related routines. In particular cast :: (Typeable a, Typeable b) => a -> Maybe b achieves a type-safe behaviour, so it is safe to use, yet it has to play dirty tricks in its implementation.
Maybe unsafeCoerce can find some use when importing FFI subroutines to force types to match. After all, FFI already allows to import impure C functions as pure ones, so it is intrinsecally usafe. Note that "unsafe" does not mean impossible to use, but just "putting the burden of proof on the programmer".
Finally, pretend that sortBy did not exist. Consider then this example:
-- Like Int, but using the opposite ordering
newtype Rev = Rev { unRev :: Int }
instance Ord Rev where compare (Rev x) (Rev y) = compare y x
sortDescending :: [Int] -> [Int]
sortDescending = map unRev . sort . map Rev
The code above works, but feels silly IMHO. We perform two maps using functions such as Rev,unRev which we know to be no-ops at runtime. So we just scan the list twice for no reason, but that of convincing the compiler to use the right Ord instance.
The performance impact of these maps should be small since we also sort the list. Yet it is tempting to rewrite map Rev as unsafeCoerce :: [Int]->[Rev] and save some time.
Note that having a coercing function
castNewtype :: IsNewtype t1 t2 => f t2 -> f t1
where the constraint means that t1 is a newtype for t2 would help, but it would be quite dangerous. Consider
castNewtype :: Data.Set Int -> Data.Set Rev
The above would cause the data structure invariant to break, since we are changing the ordering underneath! Since Data.Set is implemented as a binary search tree, it would cause quite a large damage.

How efficient is the derived Eq instance in GHC?

Is there a short circuit built in to GHC's (and Haskell's in general) derived Eq instance that will fire when I compare the same instance of a data type?
-- will this fire?
let same = complex == complex
My plan is to read in a lazy datastructure (let's say a tree), change some values and then compare the old and the new version to create a diff that will then be written back to the file.
If there would be a short circuit built in then the compare step would break as soon as it finds that the new structure is referencing old values. At the same time this wouldn't read in more than necessary from the file in the first place.
I know I'm not supposed to worry about references in Haskell but this seems to be a nice way to handle lazy file changes. If there is no shortcircuit builtin, would there be a way to implement this? Suggestions on different schemes welcome.
StableNames are specifically designed to solve problems like yours.
Note that StableNames can only be created in the IO monad. So you have two choices: either create your objects in the IO monad, or use unsafePerformIO in your (==) implementation (which is more or less fine in this situation).
But I should stress that it is possible to do this in a totally safe way (without unsafe* functions): only creation of stable names should happen in IO; after that, you may compare them in a totally pure way.
E.g.
data SNWrapper a = SNW !a !(StableName a)
snwrap :: a -> IO (SNWrapper a)
snwrap a = SNW a <$> makeStableName a
instance Eq a => Eq (SNWrapper a) where
(SNW a sna) (SNW b snb) = sna == snb || a == b
Notice that if stable name comparison says "no", you still need to perform full value comparison to get a definitive answer.
In my experience that worked pretty well when you have lots of sharing and for some reason are not willing to use other methods to indicate sharing.
(Speaking of other methods, you could, for example, replace the IO monad with a State Integer monad and generate unique integers in that monad as an equivalent of "stable names".)
Another trick is, if you have a recursive data structure, make the recursion go through SNWrapper. E.g. instead of
data Tree a = Bin (Tree a) (Tree a) | Leaf a
type WrappedTree a = SNWrapper (Tree a)
use
data Tree a = Bin (WrappedTree a) (WrappedTree a) | Leaf a
type WrappedTree a = SNWrapper (Tree a)
This way, even if short-circuiting doesn't fire at the topmost layer, it might fire somewhere in the middle and still save you some work.
There's no short-circuiting when both arguments of (==) are the same object. The derived Eq instance will do a structural comparison, and in the case of equality, of course needs to traverse the entire structure. You can build in a possible shortcut yourself using
GHC.Prim.reallyUnsafePtrEquality# :: a -> a -> GHC.Prim.Int#
but that will in fact fire only rarely:
Prelude GHC.Base> let x = "foo"
Prelude GHC.Base> I# (reallyUnsafePtrEquality# x x)
1
Prelude GHC.Base> I# (reallyUnsafePtrEquality# True True)
1
Prelude GHC.Base> I# (reallyUnsafePtrEquality# 3 3)
0
Prelude GHC.Base> I# (reallyUnsafePtrEquality# (3 :: Int) 3)
0
And if you read a structure from file, it will certainly not find it the same object as one that was already in memory.
You can use rewrite rules to avoid the comparison of lexically identical objects
module Equal where
{-# RULES
"==/same" forall x. x == x = True
#-}
main :: IO ()
main = let x = [1 :: Int .. 10] in print (x == x)
which leads to
$ ghc -O -ddump-rule-firings Equal.hs
[1 of 1] Compiling Equal ( Equal.hs, Equal.o )
Rule fired: Class op enumFromTo
Rule fired: ==/same
Rule fired: Class op show
the rule firing (note: it didn't fire with let x = "foo", but with user-defined types, it should).

Are newtypes faster than enumerations?

According to this article,
Enumerations don't count as single-constructor types as far as GHC is concerned, so they don't benefit from unpacking when used as strict constructor fields, or strict function arguments. This is a deficiency in GHC, but it can be worked around.
And instead the use of newtypes is recommended. However, I cannot verify this with the following code:
{-# LANGUAGE MagicHash,BangPatterns #-}
{-# OPTIONS_GHC -O2 -funbox-strict-fields -rtsopts -fllvm -optlc --x86-asm-syntax=intel #-}
module Main(main,f,g)
where
import GHC.Base
import Criterion.Main
data D = A | B | C
newtype E = E Int deriving(Eq)
f :: D -> Int#
f z | z `seq` False = 3422#
f z = case z of
A -> 1234#
B -> 5678#
C -> 9012#
g :: E -> Int#
g z | z `seq` False = 7432#
g z = case z of
(E 0) -> 2345#
(E 1) -> 6789#
(E 2) -> 3535#
f' x = I# (f x)
g' x = I# (g x)
main :: IO ()
main = defaultMain [ bench "f" (whnf f' A)
, bench "g" (whnf g' (E 0))
]
Looking at the assembly, the tags for each constructor of the enumeration D is actually unpacked and directly hard-coded in the instruction. Furthermore, the function f lacks error-handling code, and more than 10% faster than g. In a more realistic case I have also experienced a slowdown after converting a enumeration to a newtype. Can anyone give me some insight about this? Thanks.
It depends on the use case. For the functions you have, it's expected that the enumeration performs better. Basically, the three constructors of D become Ints resp. Int#s when the strictness analysis allows that, and GHC knows it's statically checked that the argument can only have one of the three values 0#, 1#, 2#, so it needs not insert error handling code for f. For E, the static guarantee of only one of three values being possible isn't given, so it needs to add error handling code for g, that slows things down significantly. If you change the definition of g so that the last case becomes
E _ -> 3535#
the difference vanishes completely or almost completely (I get a 1% - 2% better benchmark for f still, but I haven't done enough testing to be sure whether that's a real difference or an artifact of benchmarking).
But this is not the use case the wiki page is talking about. What it's talking about is unpacking the constructors into other constructors when the type is a component of other data, e.g.
data FooD = FD !D !D !D
data FooE = FE !E !E !E
Then, if compiled with -funbox-strict-fields, the three Int#s can be unpacked into the constructor of FooE, so you'd basically get the equivalent of
struct FooE {
long x, y, z;
};
while the fields of FooD have the multi-constructor type D and cannot be unpacked into the constructor FD(1), so that would basically give you
struct FooD {
long *px, *py, *pz;
}
That can obviously have significant impact.
I'm not sure about the case of single-constructor function arguments. That has obvious advantages for types with contained data, like tuples, but I don't see how that would apply to plain enumerations, where you just have a case and splitting off a worker and a wrapper makes no sense (to me).
Anyway, the worker/wrapper transformation isn't so much a single-constructor thing, constructor specialisation can give the same benefit to types with few constructors. (For how many constructors specialisations would be created depends on the value of -fspec-constr-count.)
(1) That might have changed, but I doubt it. I haven't checked it though, so it's possible the page is out of date.
I would guess that GHC has changed quite a bit since that page was last updated in 2008. Also, you're using the LLVM backend, so that's likely to have some effect on performance as well. GHC can (and will, since you've used -O2) strip any error handling code from f, because it knows statically that f is total. The same cannot be said for g. I would guess that it's the LLVM backend that then unpacks the constructor tags in f, because it can easily see that there is nothing else used by the branching condition. I'm not sure of that, though.

Resources