I am starting Haskell and was looking at some libraries where data types are defined with "!". Example from the bytestring library:
data ByteString = PS {-# UNPACK #-} !(ForeignPtr Word8) -- payload
                     {-# UNPACK #-} !Int                -- offset
                     {-# UNPACK #-} !Int                -- length
Now I saw this question as an explanation of what this means, and I guess it is fairly easy to understand. But my question now is: what is the point of using this? Since the expression will be evaluated whenever it is needed, why would you force the early evaluation?
In the second answer to this question C.V. Hansen says: "[...] sometimes the overhead of lazyness can be too much or wasteful". Is that supposed to mean that it is used to save memory (saving the value is cheaper than saving the expression)?
An explanation and an example would be great!
Thanks!
[EDIT] I think I should have chosen an example without {-# UNPACK #-}. So let me make one myself. Would this ever make sense? If yes, why, and in what situation?
data MyType = Const1 !Int
            | Const2 !Double
            | Const3 !SomeOtherDataTypeMaybeMoreComplex
The goal here is not strictness so much as packing these elements into the data structure. Without strictness, any of those three constructor arguments could point either to a heap-allocated value structure or a heap-allocated delayed evaluation thunk. With strictness, it could only point to a heap-allocated value structure. With strictness and packed structures, it's possible to store those values inline.
Since each of those three values is a pointer-sized entity and is accessed strictly anyway, forcing a strict and packed structure saves pointer indirections when using this structure.
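For instance, a packed variant of your MyType might look like this (I've substituted a made-up placeholder type for SomeOtherDataTypeMaybeMoreComplex):

-- Placeholder standing in for SomeOtherDataTypeMaybeMoreComplex.
data SomeOtherType = SomeOtherType String

data MyType
  = Const1 {-# UNPACK #-} !Int     -- strict and unpacked: GHC stores the raw Int# inline
  | Const2 {-# UNPACK #-} !Double  -- likewise for the Double#
  | Const3 !SomeOtherType          -- strict: always a pointer to an evaluated value, never a thunk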
In the more general case, a strictness annotation can help reduce space leaks. Consider a case like this:
data Foo = Foo Int
makeFoo :: ReallyBigDataStructure -> Foo
makeFoo x = Foo (computeSomething x)
Without the strictness annotation, if you just call makeFoo, it will build a Foo pointing to a thunk pointing to the ReallyBigDataStructure, keeping it around in memory until something forces the thunk to evaluate. If we instead have
data Foo = Foo !Int
This forces the computeSomething evaluation to proceed immediately (well, as soon as something forces makeFoo itself), which avoids leaving a reference to the ReallyBigDataStructure.
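Here's a self-contained sketch of that scenario, with made-up stand-ins for ReallyBigDataStructure and computeSomething:

-- Hypothetical stand-ins for the types in the example above.
data ReallyBigDataStructure = RBDS [Int]

computeSomething :: ReallyBigDataStructure -> Int
computeSomething (RBDS xs) = sum xs

data Foo = Foo !Int

makeFoo :: ReallyBigDataStructure -> Foo
makeFoo x = Foo (computeSomething x)
-- With the strict field, evaluating the result of `makeFoo x` to WHNF also
-- evaluates computeSomething, so no thunk keeps the RBDS alive. With a lazy
-- field, the unevaluated thunk would retain the whole structure.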
Note that this is a different use case than the bytestring code; the bytestring code forces its parameters quite frequently so it's unlikely to lead to a space leak. It's probably best to interpret the bytestring code as a pure optimization to avoid pointer dereferences.
I understand that newtype erases the type constructor at compile time as an optimization, so that newtype Foo = Foo Int results in just an Int. In other words, I am not asking this question. My question is not about what newtype does.
Instead, I'm trying to understand why the compiler can't simply apply this optimization itself when it sees a single-value data constructor. When I use hlint, it's smart enough to tell me that a single-value data constructor should be a newtype. (I never make this mistake, but tried it out to see what would happen. My suspicions were confirmed.)
One objection could be that without newtype, we couldn't use GeneralizedNewtypeDeriving and other such extensions. But that's easily solved. If we say…
data Foo m a b = Foo a (m b) deriving (Functor, Applicative, Monad)
The compiler can just barf and tell us of our folly.
Why do we need newtype when the compiler can always figure it out for itself?
It seems plausible that newtype started out mostly as a programmer-supplied annotation to perform an optimization that compilers were too stupid to figure out on their own, sort of like the register keyword in C.
However, in Haskell, newtype isn't just an advisory annotation for the compiler; it actually has semantic consequences. The types:
newtype Foo = Foo Int
data Bar = Bar Int
declare two non-isomorphic types. Specifically, Foo undefined and undefined :: Foo are equivalent while Bar undefined and undefined :: Bar are not, with the result that:
Foo undefined `seq` "not okay" -- is an exception
Bar undefined `seq` "okay" -- is "okay"
and
case undefined of Foo n -> "okay" -- is okay
case undefined of Bar n -> "not okay" -- is an exception
As others have noted, if you make the data field strict:
data Baz = Baz !Int
and take care to only use irrefutable pattern matches, then Baz acts just like the newtype Foo:
Baz undefined `seq` "not okay" -- exception, like Foo
case undefined of ~(Baz n) -> "okay" -- is "okay", like Foo
In other words, if my grandmother had wheels, she'd be a bike!
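Here's a small runnable check of those behaviours (my own throwaway test harness; it catches the exceptions so all six cases can be printed in one run):

import Control.Exception (SomeException, evaluate, try)

newtype Foo = Foo Int
data    Bar = Bar Int
data    Baz = Baz !Int

-- Evaluate an expression of type () and report whether it threw.
check :: String -> () -> IO ()
check label x = do
  r <- try (evaluate x) :: IO (Either SomeException ())
  putStrLn (label ++ ": " ++ either (const "exception") (const "okay") r)

main :: IO ()
main = do
  check "Foo undefined `seq` ()"           (Foo undefined `seq` ())
  check "Bar undefined `seq` ()"           (Bar undefined `seq` ())
  check "case undefined of Foo _ -> ()"    (case undefined of Foo _ -> ())
  check "case undefined of Bar _ -> ()"    (case undefined of Bar _ -> ())
  check "Baz undefined `seq` ()"           (Baz undefined `seq` ())
  check "case undefined of ~(Baz _) -> ()" (case undefined of ~(Baz _) -> ())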
So, why can't the compiler simply apply this optimization itself when it sees a single-value data constructor? Well, it can't perform this optimization in general without changing the semantics of a program, so it needs to first prove that the semantics are unchanged if a particular arbitrary, one-constructor, one-field data type is made strict in its field and matched irrefutably instead of strictly. Since this depends on how values of the type are actually used, this can be hard to do for data types exported by a module, especially at function call boundaries, but the existing optimization mechanisms for specialization, inlining, strictness analysis, and unboxing often perform equivalent optimizations in chunks of self-contained code, so you may get the benefits of a newtype even when you use a data type by accident. In general, though, it seems to be too hard a problem for the compiler to solve, so the burden of remembering to newtype things is left on the programmer.
This leads to the obvious question -- why can't we change the semantics so they're equivalent; why are the semantics of newtype and data different in the first place?
Well, the reason for the newtype semantics seems pretty obvious. As a result of the nature of the newtype optimization (erasure of the type and constructor at compile time), it becomes impossible -- or at the very least exceedingly difficult -- to separately represent Foo undefined and undefined :: Foo at compile time, which explains the equivalence of these two values. Consequently, irrefutable matching is an obvious further optimization when there's only one possible constructor and there's no possibility that that constructor isn't present (or at least no possibility of distinguishing between presence and absence of the constructor, because the only case where this could happen is in distinguishing between Foo undefined and undefined :: Foo, which we've already said can't be distinguished in compiled code).
The reason for the semantics of a one-constructor, one-field data type (in the absence of strictness annotations and irrefutable matches) is maybe less obvious. However, these semantics are entirely consistent with data types having constructor and/or field counts other than one, while the newtype semantics would introduce an arbitrary inconsistency between this one special case of a data type and all others.
Because of this historical distinction between data and newtype types, a number of subsequent extensions have treated them differently, further entrenching the different semantics. You mention GeneralizedNewtypeDeriving, which works on newtypes but not on one-constructor, one-field data types. There are further differences in the calculation of representational equivalence used for safe coercions (i.e., Data.Coerce) and DerivingVia, in the use of existential quantification or more general GADTs, in the UNPACK pragma, etc. There are also some differences in the way types are represented in generics, though now that I look at them more carefully, they seem pretty superficial.
Even if newtypes were an unnecessary historical mistake that could have been replaced by special-casing certain data types, it's a little late to put the genie back in the bottle.
Besides, newtypes don't really strike me as unnecessary duplication of an existing facility. To me, data and newtype types are conceptually quite different. A data type is an algebraic, sum-of-products type, and it's just coincidence that a particular special case of algebraic types happens to have one constructor and one field and so ends up being (nearly) isomorphic to the field type. In contrast, a newtype is intended from the start to be an isomorphism of an existing type, basically a type alias with an extra wrapper to distinguish it at the type level and allow us to pass around a separate type constructor, attach instances, and so on.
This is an excellent question. Semantically,
newtype Foo = Foo Int
is identical to
data Foo' = Foo' !Int
except that pattern matching on the former is lazy and on the latter is strict. So a compiler certainly could compile them the same, and adjust the compilation of pattern matching to keep the semantics right.
For a type like you've described, that optimization isn't really all that critical in practice, because users can just use newtype and sprinkle in seqs or bang patterns as needed. Where it would get a lot more useful is for existentially quantified types and GADTs. That is, we'd like to get the more compact representation for types like
data Baz a b where
  Baz :: !a -> Baz a Bool

data Quux where
  Quux :: !a -> Quux
But GHC doesn't currently offer any such optimization, and doing so would be somewhat trickier in these contexts.
Why do we need newtype when the compiler can always figure it out for itself?
It can’t. data and newtype have different semantics: data adds an additional level of indirection, while newtype has exactly the same representation as the type it wraps and always uses lazy pattern matching. With data, you choose whether fields are lazy or strict, using strictness annotations (! or extensions like StrictData).
Likewise, a compiler doesn’t always know for certain when data can be replaced with newtype. Strictness analysis allows it to conservatively determine when it may remove unnecessary laziness around things that will always be evaluated; in this case it can effectively remove the data wrapper locally. GHC does something similar when removing extra boxing & unboxing in a chain of operations on a boxed numeric type like Int, so it can do most of the calculations on the more efficient unboxed Int#. But in general (that is, without global optimisation) it can’t know whether some code is relying on that thunk’s being there.
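As a rough illustration of that kind of local unboxing: in a clearly strict loop like the one below, GHC's strictness analysis (with -O) typically produces a worker that runs on raw Int# values (a sketch of when the optimisation tends to fire, not a guarantee):

-- GHC's demand analysis finds that `go` is strict in both acc and i, so with
-- -O the generated worker loops over unboxed Int# values instead of
-- allocating a boxed Int (or a chain of thunks) per iteration.
sumTo :: Int -> Int
sumTo n = go 0 1
  where
    go acc i
      | i > n     = acc
      | otherwise = go (acc + i) (i + 1)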
So HLint offers this as a suggestion because usually you don’t need the “extra” wrapper at runtime, but other times it’s essential. The advice is just that: advice.
I noticed this pattern is very common in Haskell libraries:
data Foo = Foo { field :: {-# UNPACK #-} !Sometype }
i.e., unpacking a field and making it strict.
I understand the effect of the pragma and the annotation, but I don't understand why the idiom is so pervasive: I have been programming in Haskell for 15 years and have seldom used strictness annotations, and never the UNPACK pragma.
If this idiom is so useful, why not make it less "ugly"?
The pragma may be a bit ugly, but it avoids a lot more ugliness elsewhere. When performance is critical, programmers often need to choose a particular shape for a data constructor. Suppose I have
data Point = Point Int Int
data Segment = Segment Point Point
That makes good logical sense, but it has a bunch of extra indirection: one Segment consists of seven heap objects. If I'm working with a lot of segments, that's pretty bad.
I could squash this flat by hand:
data Segment = Segment Int# Int# Int# Int#
but now I've lost the fact that the numbers represent points, and everything I do with a segment will have to involve rather inconvenient and weird unboxed operations.
Fortunately, there's a better way:
-- The small strict Int fields will be unpacked by default
-- with any reasonably recent GHC version.
data Point = Point !Int !Int
data Segment = Segment {-# UNPACK #-} !Point {-# UNPACK #-} !Point
This still gives me one heap object per segment, but I can use Points and Ints and (generally) rely on the compiler unboxing everything nicely.
In Data.ByteString.Internal, ByteString has the constructor
PS !!(ForeignPtr Word8) !!Int !!Int
What do these double exclamation marks mean here? I searched and only found that (!!) can be used to index a list: (!!) :: [a] -> Int -> a.
This is not part of the actual Haskell source but an (undocumented) feature of how Haddock renders unboxed data types. See https://mail.haskell.org/pipermail/haskell-cafe/2009-January/054135.html:
2009/1/21 Stephan Friedrichs <...>:
Hi,
using haddock-2.4.1 and this file:
module Test where
data Test
    = NonStrict Int
    | Strict !Int
    | UnpackedStrict {-# UNPACK #-} !Int
The generated documentation looks like this:
data Test
Constructors
NonStrict Int
Strict !Int
UnpackedStrict !!Int
Note the double '!' in the last constructor. This is not intended
behaviour, is it?
This is the way GHC pretty prints unboxed types, so I thought Haddock
should follow the same convention. Hmm, perhaps Haddock should have a
chapter about language extensions in its documentation, with a
reference to the GHC documentation. That way the language used is at
least documented. Not sure if it helps in this case though, since "!!"
is probably not documented there.
Perhaps we should not display unbox annotations at all since they are
an implementation detail, right? We could display one "!" instead,
indicating that the argument is strict.
David
I'm concerned with if and when a polymorphic "global" class value is shared/memoized, particularly across module boundaries. I have read this and this, but they don't quite seem to reflect my situation, and I'm seeing some different behavior from what one might expect from the answers.
Consider a class that exposes a value that can be expensive to compute:
{-# LANGUAGE FlexibleInstances, UndecidableInstances #-}
module A where

import Debug.Trace

class Costly a where
  costly :: a

instance Num i => Costly i where
  -- an expensive (but non-recursive) computation
  costly = trace "costly!" $ (repeat 1) !! 10000000

foo :: Int
foo = costly + 1

costlyInt :: Int
costlyInt = costly
And a separate module:
module B where

import A

bar :: Int
bar = costly + 2

main :: IO ()
main = do
  print foo
  print bar
  print costlyInt
  print costlyInt
Running main yields two separate evaluations of costly (as indicated by the trace): one for foo, and one for bar. I know that costlyInt just returns the (evaluated) costly from foo, because if I remove print foo from main, then the first print costlyInt triggers the costly evaluation instead. (I can also cause costlyInt to perform a separate evaluation no matter what, by generalizing the type of foo to Num a => a.)
I think I know why this behavior happens: the instance of Costly is effectively a function that takes a Num dictionary and generates a Costly dictionary. So when compiling bar and resolving the reference to costly, ghc generates a fresh Costly dictionary, which has an expensive thunk in it. Question 1: am I correct about this?
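For concreteness, here is the hand-written sketch of the dictionary translation I have in mind (not GHC's actual Core; the names are made up):

-- A hand-written approximation of dictionary passing, not real Core.
-- The class becomes a record of its methods...
data CostlyDict a = CostlyDict { costlyMethod :: a }

-- ...and `instance Num i => Costly i` becomes a function from a Num
-- dictionary to a Costly dictionary. Every use site that resolves `costly`
-- through this instance builds its own CostlyDict, each containing a fresh
-- thunk for the expensive computation.
mkCostlyDict :: Num i => CostlyDict i
mkCostlyDict = CostlyDict (repeat 1 !! 10000000)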
There are a few ways to cause just one evaluation of costly, including:
Put everything in one module.
Remove the Num i instance constraint and just define a Costly Int instance.
Unfortunately, the analogs of these solutions are not feasible in my program -- I have several modules that use the class value in its polymorphic form, and only in the top-level source file are concrete types finally used.
There are also changes that don't reduce the number of evaluations, such as:
Using INLINE, INLINABLE, or NOINLINE on the costly definition in the instance. (I didn't expect this to work, but hey, worth a shot.)
Using a SPECIALIZE instance Costly Int pragma in the instance definition.
The latter is surprising to me -- I'd expected it to be essentially equivalent to the second item above that did work. That is, I thought it would generate a special Costly Int dictionary, which all of foo, bar, and costlyInt would share. My question 2: what am I missing here?
My final question: is there any relatively simple and foolproof way to get what I want, i.e., all references to costly of a particular concrete type being shared across modules? From what I've seen so far, I suspect the answer is no, but I'm still holding out hope.
Controlling sharing is tricky in GHC. There are many optimizations that GHC does which can affect sharing (such as inlining, floating things out, etc).
In this case, to answer the question why the SPECIALIZE pragma did not achieve the intended effect, let's look at the Core of the B module, in particular of the bar function:
Rec {
bar_xs
bar_xs = : x1_r3lO bar_xs
end Rec }
bar1 = $w!! bar_xs 10000000
-- ^^^ this repeats the computation. bar_xs is just repeat 1
bar =
case trace $fCostlyi2 bar1 of _ { I# x_aDm -> I# (+# x_aDm 2) }
-- ^^^ this is just the "costly!" string
That didn't work as we wanted. Instead of reusing costly, GHC decided to just inline the costly function.
So we have to prevent GHC from inlining costly, or the computation will be duplicated. How do we do that? You might think adding a {-# NOINLINE costly #-} pragma would be enough, but unfortunately specialization and NOINLINE don't seem to work well together:
A.hs:13:3: Warning:
Ignoring useless SPECIALISE pragma for NOINLINE function: ‘$ccostly’
But there is a trick to convince GHC to do what we want: we can write costly in the following way:
instance Num i => Costly i where
  -- an expensive (but non-recursive) computation
  costly = memo where
    memo :: i
    memo = trace "costly!" $ (repeat 1) !! 10000000
    {-# NOINLINE memo #-}
  {-# SPECIALIZE instance Costly Int #-}
  -- (this might require -XScopedTypeVariables)
This allows us to specialize costly while simultaneously avoiding the inlining of our computation.
From https://stackoverflow.com/a/15243682/944430:
And then there is coding: use unboxed types (no GC), minimize lazy structure allocation. Keep long lived data around in packed form. Test and benchmark.
1.) What are unboxed types? I am pretty sure he is speaking about data types, something like Just x or IO y (boxed). But what about newtypes? If I understood it correctly, newtype has no overhead at all and therefore shouldn't count as a boxed type?
2.) What does he mean by Keep long lived data around in packed form.?
3.) What else can I do to prevent GC pauses?
1.
Unboxed types are the primitives in Haskell. For example, Int is defined as: data Int = GHC.Types.I# GHC.Prim.Int# (for the GHC compiler). The trailing # symbol is used to indicate primitives (this is only convention). Primitives don't really exist in Haskell. You can't define additional primitives. When they appear in code, the compiler is responsible for translating them to 'real' function calls (functions can be primitives too) and datatypes.
Yes, newtype does not 'box' a type additionally. But you can't have a newtype containing a primitive - newtype Int2 = Int2 Int# is invalid while data Int2 = Int2 Int# is fine.
The main difference between primitive and boxed types, in the context of the question you linked, is how they are represented in memory. A primitive value is the raw machine value itself, with no pointers to follow, whereas a pointer to an Int may point to a thunk which points to a thunk ... etc. Note that this means that primitives are always strict. If you believe this will be an issue, use the UNPACK pragma, which removes any 'intermediate' boxing. That is,
data D = D (Int, Int)
is stored as a pointer (D) to a pointer (the tuple) to a block of memory containing two pointers to Ints, each of which wraps an actual Int#. However,
data D = D {-# UNPACK #-} !(Int, Int)
is stored as a pointer (D) to two Ints, thereby removing one level of boxing. Note the !: it indicates that the field is strict, and it is required for UNPACK.
2. Any data which is going to be passed to polymorphic functions should be kept packed, as unpacked data passed to polymorphic functions will be repacked anyway, introducing unnecessary overhead (see the sketch after point 3). The reasoning behind keeping long-lived data packed is that it is more likely to be used in an intermediate datatype or function which will require repacking, while this is easier to control with short-lived data which is only passed to a few functions before being garbage collected.
3. In 99% of cases, you won't have issues with garbage collector pauses. In general, there aren't things you can do to guarantee the GC will not pause. The only suggestion I have is, don't try to reinvent the wheel. There are libraries designed for high-performance computations with large amounts of data (repa, vector, etc). If you try to implement it yourself, chances are, they did it better!
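Returning to point 2, here is a hedged sketch of the repacking overhead (my own example, not from the linked post):

-- P stores its Int unpacked as a raw Int#.
data P = P {-# UNPACK #-} !Int

-- A polymorphic function can only receive ordinary boxed, lifted values...
idPoly :: a -> a
idPoly = id

-- ...so when the unpacked field is passed to it, GHC has to re-box the Int#
-- into a heap-allocated Int first (unless inlining or specialisation
-- removes the polymorphic call).
reboxed :: P -> Int
reboxed (P n) = idPoly n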
If you define data Int2 = Int2 Int, you could think of Int# as unboxed, plain Int as boxed, and Int2 as "double boxed". If you used newtype instead of data, it would avoid one level of indirection. But Int itself is still boxed, therefore Int2 is boxed too.
As for packed form, without going into too much detail, it is intuitively similar to this kind of C code:
struct PackedCoordinate {
    int x;
    int y;
};

struct UnpackedCoordinate {
    int *x;
    int *y;
};
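For comparison, a rough Haskell analogue of those two structs (the names are mine):

-- Analogous to PackedCoordinate: both Ints are stored inline in the constructor.
data PackedCoord = PackedCoord {-# UNPACK #-} !Int {-# UNPACK #-} !Int

-- Analogous to UnpackedCoordinate: each field is a pointer to a separate
-- (possibly unevaluated) Int heap object.
data UnpackedCoord = UnpackedCoord Int Int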
I'm not sure why he suggested keeping long-lived data in packed form. Anyhow, it seems from the documentation I linked to that one should be careful using the {-# UNPACK #-} pragma, because if you're unlucky, GHC might need to repack its values before function calls, making it allocate more memory than it would have if it hadn't been unpacked to begin with.
As for avoiding garbage collection pauses: I think you should approach this like anything else related to profiling: find the bottleneck in your program and then work from there.
PS. Please comment on anything I happen to be incorrect about. :)