https://stackoverflow.com/a/15243682/944430
And then there is coding: use unboxed types (no GC), minimize lazy structure allocation. Keep long lived data around in packed form. Test and benchmark.
1.) What are unboxed types? I am pretty sure he is speaking about data types, something like Just x or IO y (boxed). But what about newtypes? If I understood it correctly, newtype has no overhead at all and therefore shouldn't count as a boxed type?
2.) What does he mean by "Keep long lived data around in packed form"?
3.) What else can I do to prevent GC pauses?
1.
Unboxed types are the primitives in Haskell. For example, Int is defined as: data Int = GHC.Types.I# GHC.Prim.Int# (for the GHC compiler). The trailing # symbol indicates a primitive (this is only a naming convention). Primitives don't really exist in Haskell: you can't define additional ones. When they appear in code, the compiler is responsible for translating them into 'real' function calls and datatypes (functions can be primitive too).
Yes, newtype does not 'box' a type additionally. But you can't have a newtype containing a primitive - newtype Int2 = Int2 Int# is invalid while data Int2 = Int2 Int# is fine.
The main difference between primitive and boxed types, in the context of the question you linked, is how they are represented in memory. A primitive type has no pointers to follow: wherever an Int# is stored, the actual machine value is there, whereas an Int may point to a thunk which points to a thunk... etc. Note that this means primitives are always strict. If you believe this will be an issue, use the UNPACK pragma, which removes any 'intermediate' boxing. That is,
data D = D (Int, Int)
is stored as a pointer (D) to a pointer (the tuple) to a block of memory containing two pointers to boxed Ints, each of which holds an actual Int#. However,
data D = D {-# UNPACK #-} !(Int, Int)
is stored as a pointer (D) to two Ints, thereby removing one level of boxing. Note the !: it marks the field as strict, and is required for UNPACK.
2. Any data which is going to be passed to polymorphic functions should be kept packed, since unpacked data passed to polymorphic functions will be repacked anyway (introducing unnecessary overhead). The reasoning behind keeping long-lived data packed is that it is more likely to be used in an intermediate datatype or function which will require repacking, whereas this is easier to control with short-lived data which is only passed to a few functions before being garbage collected.
3. In 99% of cases, you won't have issues with garbage collector pauses. In general, there isn't much you can do to guarantee the GC will not pause. The only suggestion I have is: don't try to reinvent the wheel. There are libraries designed for high-performance computations on large amounts of data (repa, vector, etc.). If you try to implement it yourself, chances are they did it better!
If you define data Int2 = Int2 Int, you could think of Int# as unboxed, plain Int as boxed, and Int2 as "double boxed". Had you used newtype instead of data, one level of indirection would have been avoided. But Int itself is still boxed, and therefore Int2 is boxed too.
As for packed form, without going into too much detail, it is intuitively similar to this kind of C code:
struct PackedCoordinate {
    int x;
    int y;
};

struct UnpackedCoordinate {
    int *x;
    int *y;
};
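The same distinction can be sketched in Haskell (the type and field layout below are illustrative, not from any library):

```haskell
-- Corresponds to UnpackedCoordinate above: each field is a pointer
-- to a boxed Int (or a thunk) somewhere on the heap.
data UnpackedCoord = UnpackedCoord Int Int

-- Corresponds to PackedCoordinate above: the strict, UNPACKed fields
-- are stored inline in the constructor as raw machine words.
data PackedCoord = PackedCoord {-# UNPACK #-} !Int {-# UNPACK #-} !Int

manhattan :: PackedCoord -> Int
manhattan (PackedCoord x y) = abs x + abs y
```

Pattern matching looks identical for both types; only the in-memory representation differs.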
I'm not sure why he suggested keeping long-lived data in packed form. Anyhow, it seems from the documentation I linked to that one should be careful using the {-# UNPACK #-} pragma, because if you're unlucky, GHC might need to repack its values before function calls, causing it to allocate more memory than it would have if it hadn't been unpacked to begin with.
As for avoiding garbage collections, I think you should approach this like anything else related to profiling: find the bottleneck in your program and then work from there.
PS. Please comment on anything I happen to be incorrect about. :)
Related
I wonder what the memory footprint of a variable of type IORef a is, if I know that the size of a is x.
Also, what is the expected performance of the function writeIORef applied to an integer, compared to a regular variable assignment (like x = 3) in, say, Java?
In Haskell, an IORef a behaves like a single-element mutable array. The definition of IORef is the following, disregarding newtype wrapping:
data IORef a = IORef (MutVar# RealWorld a)
Here, MutVar# RealWorld a is a primitive mutable reference type. It is a pointer to a two-word heap object: a header, and a payload which is itself a pointer to a normal lifted Haskell object. Hence the overhead of MutVar# is two words (16 bytes on 64-bit systems) and one indirection.
The overhead of MutVar# is thus one extra indirection and one extra header word. This is unavoidable. In contrast, the overhead of the IORef constructor is also one header word and one indirection, but it can be eliminated by unpacking IORef:
data Foo a = Foo !(IORef a) a a
Here, the bang on IORef causes the underlying MutVar to be unpacked into Foo. But while this unpacking works whenever we define new data types, it does not work if we use any existing parameterized type, like lists. In [IORef a], we pay the full cost with two extra indirections.
IORef will also generally be unpacked by GHC's optimizer when it is used as an argument to a function: IORef a -> b will generally be unboxed to MutVar# RealWorld a -> b if you compile with optimization.
However, all of the above overheads are less important than the overhead in garbage collection when you use a large number of IORef-s. To avoid that, it is advisable to use a single mutable array instead of many IORef-s.
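As a sketch of that last suggestion, here is the single-array approach using the array package that ships with GHC (the function name is made up for illustration):

```haskell
import Data.Array.IO (IOUArray, newArray, readArray, writeArray)

-- One unboxed mutable array instead of many IORef Int: the GC sees a
-- single heap object rather than a large number of mutable references
-- that each have to be tracked and scanned.
sumInPlace :: Int -> IO Int
sumInPlace n = do
  arr <- newArray (0, n - 1) 0 :: IO (IOUArray Int Int)
  mapM_ (\i -> writeArray arr i i) [0 .. n - 1]
  let go i acc
        | i == n    = return acc
        | otherwise = do
            x <- readArray arr i
            go (i + 1) (acc + x)
  go 0 0
```

Because IOUArray is unboxed, the elements are stored as raw machine words and never individually touched by the garbage collector.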
What is the relation between, e.g., CInt vs Int# vs. CInt#?
For example, if I call a foreign function which returns a CInt, isn't it already a CInt# by construction (i.e., it's a raw int on the stack, not a pointer to something on the heap which contains an int)?
And in that case, what would be the difference between CInt and Int#?
If I'm trying to eke out every bit of performance that I can, which one to use out of CInt and Int# and CInt#?
I don't think there is such a thing as CInt#.
CInt is just a custom type which is guaranteed to play well with C (see the blurb at the top of this page for a more formal take on that). It is boxed, so you take a performance hit for that.
Int# is a "magic" unboxed int. As it turns out, it does play well with the FFI, so use that if you want every last bit of performance.
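For a concrete picture of CInt in FFI code, here is a minimal sketch binding C's abs from libc (the binding name c_abs is made up; conversions truncate for values outside CInt's range):

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CInt)

-- Bind C's abs from the standard library; it takes and returns a raw C int.
foreign import ccall unsafe "abs" c_abs :: CInt -> CInt

-- CInt converts to and from Int with fromIntegral, which GHC compiles
-- to a cheap (often free) numeric conversion.
absViaC :: Int -> Int
absViaC = fromIntegral . c_abs . fromIntegral
```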
I've recently been looking around at various Haskell quirks, like unboxed types and whatnot, when I discovered the Addr# type.
The GHC.Prim package describes it as so:
An arbitrary machine address assumed to point outside the garbage-collected heap.
And that means not much to me.
Furthermore, I keep finding functions like this that use the type:
readIntOffAddr# :: Addr# -> Int# -> State# s -> (# State# s, Int# #)
What is this type? What can I do with it? Why is it necessary?
As a complement to Michael's answer:
Addr# is the unboxed type underlying a Ptr a in the same way that Int# is the unboxed type underlying an Int. Its contents are presumably to be interpreted as a machine address, though as far as the compiler and GC are concerned it is just another integral type (of whatever size pointers are on the system in question). Since it is an arbitrary machine address and not a GC-managed pointer, it should presumably not point into the Haskell heap because the addresses of Haskell heap objects are not stable as viewed from the level of Haskell (a GC could occur at any point in your program, and then whatever object your Addr# pointed at now lives elsewhere, or nowhere at all).
Normally a Ptr a/Addr# will contain a pointer returned from malloc/mmap/etc., or a pointer to a C global variable, or in general any sort of thing that a pointer might sensibly point to in a C program. You would normally use readIntOffAddr# when interfacing with a C function that returns or modifies the contents of a passed HsInt *. (Well, you wouldn't use it directly, you'd use Int's peekElemOff Storable method which I presume is implemented in terms of readIntOffAddr#, or you would use an even higher-level function like peekArray).
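For illustration, the higher-level Storable route just mentioned looks like this, using only base (the values are arbitrary):

```haskell
import Foreign.Marshal.Array (withArray)
import Foreign.Storable (peekElemOff)

-- Marshal a list into a temporary C array, then read element 2 via
-- the Storable interface, which bottoms out in readIntOffAddr#.
readThird :: IO Int
readThird = withArray [10, 20, 30, 40 :: Int] $ \ptr ->
  peekElemOff ptr 2
```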
The equivalent* C code would be:
long readIntOffAddr(long *ptr, long offset) {
    return ptr[offset];
}
Addr# is just like void *. The function has an IO-like signature because it is not "pure". Multiple calls to the function may return different values (obviously).
* Update (2018): I just learned that equating C's int type with Haskell's Int# type is wrong. So I changed int to long in the above code snippet. This is also (maybe) not 100% correct, but at least it is true for all GHC implementations that I have seen. In GHC versions 6 to 8 (I haven't checked others), Int# is 32 bits wide on 32-bit platforms and 64 bits wide on 64-bit platforms. This matches the behaviour of long for all C/C++ implementations on 32-bit and 64-bit platforms that I am aware of, so I think equating Int# with long is a good first approximation. No one noticed this minor inaccuracy in the last 3 years (or cared enough to edit/comment). I doubt that there is any Haskell/platform/C combination where HsInt != long and the Haskell implementation has a readIntOffAddr# function... please prove me wrong.
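You can check these width claims on your own platform with Storable's sizeOf; a quick sketch (the exact numbers are platform-dependent):

```haskell
import Foreign.C.Types (CInt, CLong)
import Foreign.Storable (sizeOf)

-- Widths in bytes of Haskell's Int, C's int, and C's long on this
-- platform. On a typical 64-bit Unix system this is (8, 4, 8),
-- i.e. Int matches long, not int.
widths :: (Int, Int, Int)
widths =
  ( sizeOf (undefined :: Int)
  , sizeOf (undefined :: CInt)
  , sizeOf (undefined :: CLong)
  )
```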
GHC Haskell exposes the ghc-prim package, which contains definitions of unboxed values, such as Int#, Char#, etc.
How do they differ from the default Int, Char, etc., types in regular Haskell? An assumption would be that they're faster, but why?
When should one reach down to use these instead of the boxed regular alternatives?
How does using boxed vs unboxed value affect the program?
In simple terms, a value of type Int may be an unevaluated expression. The actual value isn't calculated until you "look at" the value.
A value of type Int# is an evaluated result. Always.
As a result of this, an Int is a data structure that lives on the heap. An Int# is... just a machine integer (32 or 64 bits, depending on the platform). It can live in a CPU register. You can operate on it with a single machine instruction. It has almost no overhead.
By contrast, when you write, say, x + 1, you're not actually computing x + 1, you're creating a data structure on the heap that says "when you want to compute this, do x + 1".
Put simply, Int# is faster, because it can't be lazy.
When should you use it? Almost never. That's the compiler's job. The idea being that you write nice high-level Haskell code involving Int, and the compiler figures out where it can replace Int with Int#. (We hope!) If it doesn't, it's almost always easier to throw in a few strictness annotations rather than play with Int# directly. (It's also non-portable; only GHC uses Int# - although currently there aren't really any other widely used Haskell compilers.)
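For completeness, this is what working with Int# directly looks like (a sketch requiring GHC's MagicHash extension; plusOne is a made-up name):

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), (+#))

-- Unbox the Int, add raw machine words with (+#), and rebox the
-- result. This is essentially what GHC's optimizer produces for you
-- in tight numeric code when strictness analysis succeeds.
plusOne :: Int -> Int
plusOne (I# x) = I# (x +# 1#)
```

Note how the boxed and unboxed worlds are bridged by the I# constructor: ordinary Int is literally just an Int# in a box.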
I am starting Haskell and was looking at some libraries where data types are defined with "!". Example from the bytestring library:
data ByteString = PS {-# UNPACK #-} !(ForeignPtr Word8) -- payload
{-# UNPACK #-} !Int -- offset
{-# UNPACK #-} !Int -- length
Now I saw this question as an explanation of what this means, and I guess it is fairly easy to understand. But my question is now: what is the point of using this? Since the expression will be evaluated whenever it is needed, why would you force early evaluation?
In the second answer to this question C.V. Hansen says: "[...] sometimes the overhead of laziness can be too much or wasteful". Is that supposed to mean that it is used to save memory (saving the value is cheaper than saving the expression)?
An explanation and an example would be great!
Thanks!
[EDIT] I think I should have chosen an example without {-# UNPACK #-}. So let me make one myself. Would this ever make sense? Is yes, why and in what situation?
data MyType = Const1 !Int
| Const2 !Double
| Const3 !SomeOtherDataTypeMaybeMoreComplex
The goal here is not strictness so much as packing these elements into the data structure. Without strictness, any of those three constructor arguments could point either to a heap-allocated value structure or a heap-allocated delayed evaluation thunk. With strictness, it could only point to a heap-allocated value structure. With strictness and packed structures, it's possible to make those values inline.
Since each of those three values is a pointer-sized entity and is accessed strictly anyway, forcing a strict and packed structure saves pointer indirections when using this structure.
In the more general case, a strictness annotation can help reduce space leaks. Consider a case like this:
data Foo = Foo Int
makeFoo :: ReallyBigDataStructure -> Foo
makeFoo x = Foo (computeSomething x)
Without the strictness annotation, if you just call makeFoo, it will build a Foo pointing to a thunk pointing to the ReallyBigDataStructure, keeping it around in memory until something forces the thunk to evaluate. If we instead have
data Foo = Foo !Int
This forces the computeSomething evaluation to proceed immediately (well, as soon as something forces makeFoo itself), which avoids leaving a reference to the ReallyBigDataStructure.
Note that this is a different use case than the bytestring code; the bytestring code forces its parameters quite frequently so it's unlikely to lead to a space leak. It's probably best to interpret the bytestring code as a pure optimization to avoid pointer dereferences.
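The lazy-versus-strict field difference is directly observable. A base-only sketch (the type names and demo are made up for illustration):

```haskell
import Control.Exception (SomeException, evaluate, try)
import Data.Either (isRight)

data Lazy   = Lazy Int      -- field may remain an unevaluated thunk
data Strict = Strict !Int   -- field is forced when the constructor is built

-- Construct both with a bottom field: the lazy constructor succeeds
-- (the thunk is never forced), while the strict one throws immediately.
demo :: IO (Bool, Bool)
demo = do
  l <- try (evaluate (Lazy (error "boom"))) :: IO (Either SomeException Lazy)
  s <- try (evaluate (Strict (error "boom"))) :: IO (Either SomeException Strict)
  return (isRight l, isRight s)
```

Here demo returns (True, False): the lazy field quietly holds on to the erroring thunk, which is exactly the mechanism behind the space leak described above.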