Say I have a value type Foo, and a method Bar which accepts a reference to a Foo. Most languages will allow me to allocate a new Foo on the stack, and will automatically box it when I try to pass it to Bar. However, as far as I am aware, this involves copying the Foo value onto the heap, and then using that reference.
Is it possible for a language to include a way of allocating a garbage collected object on the stack? When the method ends, the runtime could check if the object is still in use, and only then would it need to allocate the object on the heap, and update the references.
I imagine this would improve performance for methods that do not keep the reference, and it would hinder performance for methods that do.
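For concreteness, here is a rough Java sketch of the scenario I mean (int/Integer stand in for Foo, since Java auto-boxes primitives when a reference is expected; the names are placeholders):

class Example {
    static void bar(Object value) {      // accepts a reference
        System.out.println(value);
    }

    public static void main(String[] args) {
        int foo = 42;                    // lives on the stack / in a register
        bar(foo);                        // auto-boxed: a heap Integer is allocated and the value copied into it
    }
}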
Yes, Graal's partial escape analysis does that. While regular EA can only stack-allocate (more precisely: decompose the object into fields and put those fields on the stack) when the object doesn't escape, partial EA can optimistically allocate on the stack and only reify the data into a heap object in the uncommon cases where the object must exist.
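As a hedged illustration (the class and the escape path are made up, not anything Graal-specific), this is the kind of code partial EA helps with:

class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }
}

class Demo {
    static Point escaped;                // somewhere the object could escape to

    static int sum(int x, int y, boolean rare) {
        Point p = new Point(x, y);       // stays a virtual object, no allocation yet
        if (rare) {
            escaped = p;                 // uncommon path: p is reified as a real heap object here
        }
        return p.x + p.y;                // common path: only the field values are used
    }
}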
Also note that garbage collection is not a binary choice. You can have environments that mix and match garbage collection, ref-counting, arena or scope-based allocators with automatic deallocation, and completely manual management. In such a setup, stack allocation could be one of the latter mechanisms while some of the heap is garbage-collected.
Box<> is explained like this in the Rust Book:
... allow you to store data on the heap rather than the stack. What remains on the stack is the pointer to the heap data.
With a description like that, I would expect the described object to be called Heap<> or something-Heap-something-else (DerefHeap, perhaps?). Instead, we use Box.
Why was the name Box chosen?
First, Heap is a very overloaded term, and importantly a heap is an abstract data structure often used to implement things like priority queues. Having a type called Heap which is not a heap would be extremely confusing, which is a good reason to avoid that name.
Second, "box" is related to the concept of "boxing" or "boxed" objects in languages which strongly distinguish between value and reference types, e.g. Java or JavaScript: https://en.wikipedia.org/wiki/Object_type_(object-oriented_programming). In those languages, a "boxed" type is the heap-allocated version of a value type, e.g. int/Integer in Java or number/Number in JavaScript.
Rust's Box performs an operation which is similar in spirit. Rust also originally had a built-in "lifting" operator called box (it's still an internal operation, and was originally planned to be stabilised for placement new), so "box"/"boxing" makes sense linguistically in a way "heap"/"heaping" really does not ("heaping" hints at a lot of things being piled onto a heap).
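As a quick illustrative sketch (not from the original answer), the operation in Rust:

fn main() {
    let on_stack: i32 = 5;              // plain value, lives on the stack
    let boxed: Box<i32> = Box::new(5);  // "boxed": the 5 is moved to the heap,
                                        // only the pointer stays on the stack
    println!("{} {}", on_stack, *boxed);
}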
After reading several wikis and Stack Overflow Q&As, I'm left with the question of how to mark/pass regions of memory allocated by the GC to a C library. Most of the libraries handling FFI seem to allocate memory first, copy the value into it, and wrap it in a Ptr type. The problem is getting a guarantee that the memory won't be moved or deallocated while spending time in the C library.
Suppose I have a value myInput of type Text and I want to do zero-copy FFI with C. What are my options?
So far I've found the following:
https://wiki.haskell.org/Foreign_Function_Interface#Pointers_to_Haskell_data
In some cases, you may want to give to the foreign code an opaque reference to a Haskell value that you will retrieve later on. You need to be sure that the value is not collected between the time you give it and the time you retrieve it. Stable pointers have been created exactly to do this. You can wrap a value into a StablePtr and give it to the foreign code (StablePtr is one of the marshallable foreign types).
After talking to some people, I was told that a StablePtr is not supposed to be dereferenced and is mainly used for passing a void* around. On the other hand, the wiki says it's suitable for this purpose, and this answer indicates the same: https://stackoverflow.com/a/10900699/1833322
More information on StablePtr:
https://www.well-typed.com/blog/2018/05/ghc-special-gc-objects/
after every GC stable pointers are updated to point to new locations of the Haskell objects that they were pointing to before the GC
https://hackage.haskell.org/package/base-4.11.1.0/docs/Foreign-StablePtr.html#t:StablePtr
A stable pointer is a reference to a Haskell expression that is guaranteed not to be affected by garbage collection, i.e., it will neither be deallocated nor will the value of the stable pointer itself change during garbage collection (ordinary references may be relocated during garbage collection). Consequently, stable pointers can be passed to foreign code, which can treat it as an opaque reference to a Haskell value.
I thought I could just wrap my Text with newStablePtr http://hackage.haskell.org/package/base-4.12.0.0/docs/Foreign-StablePtr.html#v:newStablePtr and be done with it. I don't understand what the situation with StablePtr actually is: whether I can use it for this purpose or not, and if not, what it is actually used for.
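To make it concrete, this is roughly what I had in mind (c_consume is a made-up C function that only passes the opaque reference around):

import Data.Text (Text)
import qualified Data.Text as T
import Foreign.StablePtr

-- hypothetical C function that treats the pointer as an opaque token
foreign import ccall "c_consume"
  c_consume :: StablePtr Text -> IO ()

main :: IO ()
main = do
  let myInput = T.pack "hello"
  sp <- newStablePtr myInput   -- keeps myInput alive across the call
  c_consume sp                 -- C must not dereference it
  freeStablePtr sp             -- release it once the C side is done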
Then there is pinned memory https://ghc.haskell.org/trac/ghc/wiki/Commentary/Rts/Storage/GC/Pinned with ByteString
There are also pinned ByteArrays http://hackage.haskell.org/package/primitive-0.6.2.0/docs/Data-Primitive-ByteArray.html#g:2
The thing with ByteString and ByteArray, though, is that IO libraries often provide their own data structures, so the data would first have to be copied INTO a ByteString/ByteArray, which is what I'm trying to avoid in the first place.
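For comparison, when the data already lives in a ByteString, the zero-copy route would look roughly like this (c_process is a made-up C function taking a buffer and a length):

import qualified Data.ByteString as BS
import Data.ByteString.Unsafe (unsafeUseAsCStringLen)
import Foreign.C.String (CString)
import Foreign.C.Types (CSize (..))

-- hypothetical C function taking a buffer pointer and its length
foreign import ccall "c_process"
  c_process :: CString -> CSize -> IO ()

processNoCopy :: BS.ByteString -> IO ()
processNoCopy bs =
  unsafeUseAsCStringLen bs $ \(ptr, len) ->
    -- the pointer aims straight into the pinned ByteString buffer
    -- and is only valid inside this callback
    c_process ptr (fromIntegral len)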
Maybe there are some (unsafe?) casting functions?
It seems like a simple problem, since the GC already seems to be able to mark portions of memory as "don't move" (pinning). Is there a function I can call on a structure to toggle this flag? Or are there any other suitable functions for this use case?
Summary of question and answers
Objects of a particular type, say
type Foo
a::A
b::B
end
can be stored in either of two ways:
Inlined (aka by value): in this case, the statement "variable foo::Foo is stored at location x" effectively means we have a variable foo.a::A at location x and a variable foo.b::B at location x + sizeof(A) (technically the addresses could be a bit more complicated, but that's irrelevant for our purposes).
Referenced (aka by reference): "foo::Foo is stored at location x" means the location x contains a pointer fooptr::Ptr{Foo} such that there is a variable foo.a::A at location fooptr and foo.b::B at location fooptr + sizeof(A).
Unlike other languages (I'm looking at you, C/C++), Julia decides by itself whether to store variables inlined or referenced, and it does so based on the properties of the type:
mutable types -> referenced,
immutable types -> referenced if at least one of their fields is referenced, inlined otherwise.
There are at least two reasons for this rule:
StefanKarpinski's answer: The garbage collector needs to be able to find all pointers to heap-allocated objects on the stack. Currently, Julia ensures this by storing all such pointers on a separate "shadow stack", but if we allowed composite types containing pointers to be placed on the stack, then such a neat separation would no longer be possible. Instead, the compiler would need to look for pointers among other variables, which poses technical difficulties.
yuyichao's answer: Julia requires the inline/reference decision to be made on a per-type rather than per-object basis, which means a hypothetical type
immutable A
a::A
end
would have to be infinitely big if we insisted on inlining it. So we would either have to forbid such recursive immutable types, or we could at most allow non-recursive immutable types to be inlined.
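A small sketch of this rule in current Julia syntax (struct / mutable struct instead of the older type / immutable keywords used above; the type names are purely illustrative), where isbitstype identifies exactly the inlined case:

struct Inline          # immutable, all fields pointer-free -> inlined
    a::Int
    b::Float64
end

mutable struct Boxed   # mutable -> always stored by reference
    a::Int
end

struct Mixed           # immutable, but one field is a reference -> referenced
    a::Int
    b::Boxed
end

isbitstype(Inline)  # true
isbitstype(Boxed)   # false
isbitstype(Mixed)   # false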
Original question
My understanding of memory management in Julia is:
mutable types -> heap-allocated,
immutable types and tuples -> stack-allocated unless one of their fields is heap-allocated (i.e. mutable).
I don't quite understand the rationale for this behaviour, however. I've read somewhere that the problem with stack-allocating immutables with pointers to mutables is that then the garbage collector might consider the mutables unreachable and destroy them prematurely. On the other hand, if we place the immutable on the heap then there will still be a pointer to the mutables, so it might seem like we avoided the problem, but actually we just shifted it to making sure that now the immutable itself will not be destroyed.
Can anyone explain this to someone like me who has only a very superficial knowledge of how garbage collection works?
The problem with stack allocation of objects which reference other objects is knowing that they need to be traced during garbage collection. The simplest way to do this is what Julia does: heap-allocate the objects and "root" them using a "shadow stack" which is pushed and popped in sync with the actual stack. This introduces a fair bit of overhead and forces these objects to be heap-allocated.
A more sophisticated approach that avoids the overhead of a shadow stack and heap allocation is to stack-allocate these objects and then scan the stack while doing garbage collection, following references from objects on the stack to objects on the heap. However, this requires knowing which objects on the stack are pointers to objects on the heap – in general, non-heap-allocated objects are not guaranteed to be kept intact or contiguous in registers or on the stack. One approach to doing this is called "conservative stack scanning", which entails assuming during GC that any value on the stack which looks like it could be a pointer to an object on the heap actually is one. That approach has been used successfully in applications like Safari's JavaScript engine, but it's not without its challenges. We've contemplated using conservative stack scanning in Julia, and an initial effort to do so was started, but it was never completed.
References:
https://github.com/JuliaLang/julia/issues/11714
https://github.com/JuliaLang/julia/pull/8134
There are multiple issues/concepts that are frequently mixed together whenever this is brought up.
mutable or non-pointer-free immutable doesn't necessarily mean heap allocation; we already have optimization passes to elide some of the allocations and are working on improving them further.
The object layout ABI is a user-visible behavior and not something an optimization pass can easily change (unless it can prove that the object involved in the local optimization does not escape). The current ABI is that only isbits immutables are stored inline (and "stack allocated" when used as local variables). There's a fundamental limitation to lifting the pointer-free requirement for inlined objects, namely the necessity to handle recursive types. It is impossible to have all types in a reference cycle stored inline, and the cycle has to be broken somewhere if we want to make some of them inlined. I believe we do have a consistent and predictable model to do this, though whether this is desirable is another issue.
This is somewhat related to performance, but not always. Stored inline means more copying, so it's hard to make sure there's no regression if we make the switch.
Edit: And I should also mention that pointer-freeness is a sufficient condition for cycle-freeness and is easier to compute, which is partly why we are currently using it to break inlining cycles.
GC support. This is basically the easiest part. It's very easy to make the GC recognize pointers on the stack; it just needs to be done if we decide to change the object layout ABI.
Edit: And I should add that "GC support" is needed because we currently only support a limited / simple stack layout for object references (i.e. an array of pointers). It's this that needs to be improved.
I thought one of the big features of Rust was being a systems language comparable to C, but with a garbage collector. If this is the case, why do you need to return values of a static size (or use Box, from what I gather)?
Why does Rust need to return static sizes?
Every value in every language needs to have a static size. That's how the compiler / interpreter / runtime / virtual machine / hardware knows how to access the bits that make up the value.
In many languages, every value is comparable to a Rust Box, so they all take up one or two pointer's worth of space. The statically-known size for those values allows a layer of indirection which can point to something with a runtime-determined size.
In Rust (and C, C++, probably other system languages), you can also directly store arbitrary values on the stack, unboxed. In these cases, you still need to know the size that the value will occupy.
This is a simplification, as some languages allow certain specific values to reside on the stack, while others "embed" certain value types inside of the fixed-size indirection. Tricks like these are usually for performance reasons.
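A hedged Rust sketch of the "static size" point (the function is made up): an unsized type such as str cannot be returned by value because the caller cannot reserve space for it, but a Box around it has a fixed size.

// Does not compile: `str` has no size known at compile time,
// so the caller cannot reserve stack space for the return value.
// fn greeting() -> str { unimplemented!() }

// Compiles: the Box itself is one pointer wide (statically sized);
// the string data it owns lives on the heap.
fn greeting() -> Box<str> {
    "hello".into()
}

fn main() {
    println!("{}", greeting());
}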
but with a garbage collector
Rust does not have a garbage collector. It does have smart pointers that deallocate resources when the pointer goes out of scope.
Box is the obvious smart pointer, but there's also Rc and Arc.
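For instance, a minimal sketch of the scope-based deallocation (using only std types):

use std::rc::Rc;

fn main() {
    let a = Rc::new(String::from("shared"));
    let b = Rc::clone(&a);                        // bump the reference count, no deep copy
    println!("count = {}", Rc::strong_count(&a)); // 2
    drop(b);                                      // count drops back to 1
    // when `a` goes out of scope the count reaches 0 and the String is freed,
    // with no garbage collector involved
}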
I started writing my own scripting language over the most recent weekend, both for the learning experience and for my resume when I graduate high school. So far things have gone great: I can parse variables with basic types (null, boolean, number, and string) and mathematical expressions with operator precedence, and I have a rudimentary mark-and-sweep garbage collector in place (after completing the mark/sweep collector I will implement a generational garbage collector; I know naive mark/sweep isn't very fast). I am unsure how to store the referenced objects for the garbage collector, though. As of now I have a class GCObject that stores a pointer to its memory and whether it is marked or not. Should I store a linked list of its referenced objects in the class? I have looked at garbage collectors from other languages, but I see no linked lists of references per GCObject, so it is confusing me.
TLDR: How do I store objects that are referenced by other objects in a mark and sweep garbage collector? Do I just store linked lists of objects in all my GCObjects?
Thanks guys.
You generally don't store the references to an object in anything but the locations at which those references naturally occur. During the mark operation, you don't need to know which references point to an object; rather, you need to know which references an object (or root) contains, so you can recursively mark those objects.
You also need, for the sweep phase, a way to iterate through all objects so you can finalise any unreferenced objects and return their storage to the allocation pool. How you would do this exactly depends on your general purpose allocator - you probably want to write a custom one.
(I'm assuming you don't want to do compaction - that's a whole lot more complicated).
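Here is a hedged C++ sketch of that shape (all names are made up, not from your code): objects expose the references they contain via a trace method, and a single intrusive list of every allocation drives the sweep.

#include <vector>

struct GCObject {
    bool marked = false;
    GCObject* next_allocated = nullptr;        // intrusive "all objects" list
    virtual ~GCObject() = default;
    // enumerate the references this object *contains*; no per-object list
    // of "who points at me" is ever stored
    virtual void trace(std::vector<GCObject*>& out) const = 0;
};

GCObject* all_objects = nullptr;               // head of the allocation list

void mark_from(GCObject* root) {
    std::vector<GCObject*> worklist{root};
    while (!worklist.empty()) {
        GCObject* obj = worklist.back();
        worklist.pop_back();
        if (obj == nullptr || obj->marked) continue;
        obj->marked = true;
        obj->trace(worklist);                  // push the objects it references
    }
}

void sweep() {
    GCObject** link = &all_objects;
    while (*link) {
        GCObject* obj = *link;
        if (obj->marked) {
            obj->marked = false;               // reset for the next cycle
            link = &obj->next_allocated;
        } else {
            *link = obj->next_allocated;       // unlink and free the garbage
            delete obj;
        }
    }
}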