When searching for an answer, I found this question, however there is no mention of static lifetime objects. Can the method mentioned in this answer (calling drop() on the object) be used for static lifetime objects?
I was imagining a situation like a linked list. You need to keep nodes of the list around for (potentially) the entire lifetime of the program, however you also may remove items from the list. It seems wasteful to leave them in memory for the entire execution of the program.
Thanks!
No. The very point of a static is that it's static: it has a fixed address in memory and can't be moved from there. As a consequence, everybody is free to have a reference to that object, because it's guaranteed to be there as long as the program is executing. That's why you only get to use a static in the form of a &'static reference and can never claim ownership.
Besides, doing this for the purpose of memory conservation is pointless: The object is baked into the executable and mapped to memory on access. All that could happen is for the OS to relinquish the memory mapping. Yet, since the memory is never allocated from the heap in the first place, there is no saving to be had.
The only thing you could do is to replace the object using unsafe mutable access. This is both dangerous (because the compiler is free to assume that the object does not in fact change) and pointless, due to the fact that the memory can't be freed, as it's part of the executable's memory mapping.
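A minimal sketch of the point about ownership: a static only ever hands out &'static references, so there is no owner that could drop it (the names here are illustrative):

```rust
// A static has a fixed address for the whole program run.
static GREETING: &str = "hello";

fn get_greeting() -> &'static str {
    // We can hand out the reference freely, because the value is
    // guaranteed to live as long as the program does.
    GREETING
}

fn main() {
    let r = get_greeting();
    assert_eq!(r, "hello");
    // There is no way to take ownership of the static itself, so there
    // is no owner whose drop() could free it -- the memory is part of
    // the executable's mapping, not the heap.
    println!("{}", r);
}
```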
Related
As you know, both Box::into_raw() and Box::leak() consume the current Box and lose ownership of the memory.
The two just seem to have different types of return values, what exactly is the other difference between them?
How about typical application scenarios?
into_raw is typically used for FFI to get a pointer that can be sent to the other language, and is usually matched with a later call to from_raw to reclaim ownership and free the memory.
leak is typically used to get a 'static reference to satisfy some API requirement and is usually kept until the program exits.
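A small sketch of both patterns (the values are arbitrary):

```rust
fn main() {
    // into_raw: give up ownership as a raw pointer (e.g. to hand across
    // an FFI boundary)...
    let b = Box::new(41_i32);
    let ptr: *mut i32 = Box::into_raw(b);
    // ...and later reclaim ownership with from_raw so the memory is
    // freed normally when this Box is dropped.
    let b = unsafe { Box::from_raw(ptr) };
    assert_eq!(*b, 41);

    // leak: trade the allocation for a &'static mut reference that is
    // valid until the process exits (the memory is never reclaimed).
    let s: &'static mut i32 = Box::leak(Box::new(1));
    *s += 41;
    assert_eq!(*s, 42);
}
```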
I have read that Rust's compiler "inserts" memory management code during compile time, and this sounds kind of like "compile-time garbage collection".
What is the difference between these two ideas?
I've seen What does Rust have instead of a garbage collector? but that is about runtime garbage collection, not compile-time.
Compile-time garbage collection is commonly defined as follows:
A complementary form of automatic memory management is compile-time memory management (CTGC), where the decisions for memory management are taken at compile-time instead of at run-time. The compiler determines the life-time of the variables that are created during the execution of the program, and thus also the memory that will be associated with these variables. Whenever the compiler can guarantee that a variable, or more precisely, parts of the memory resources that this variable points to at run-time, will never ever be accessed beyond a certain program instruction, then the compiler can add instructions to deallocate these resources at that particular instruction without compromising the correctness of the resulting code.
(From Compile-Time Garbage Collection for the Declarative Language Mercury by Nancy Mazur)
Rust handles memory by using a concept of ownership and borrow checking. Ownership and move semantics describe which variable owns a value. Borrowing describes which references are allowed to access a value. These two concepts allow the compiler to "drop" the value when it is no longer accessible, causing the program to call the drop method from the Drop trait.
However, the compiler itself doesn't handle dynamically allocated memory at all. It only handles drop checking (figuring out when to call drop) and inserting the .drop() calls. The drop implementation is responsible for determining what happens at this point, whether that is deallocating some dynamic memory (which is what Box's drop does, for example), or doing anything else. The compiler therefore never really enforces garbage collection, and it doesn't enforce deallocating unused memory. So we can't claim that Rust implements compile-time garbage collection, even if what Rust has is very reminiscent of it.
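A sketch of what that looks like in practice: the compiler inserts the drop call where each value's scope ends, and the Drop implementation decides what cleanup means (the Noisy type and the log are illustrative):

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        // The compiler inserts the call to this method at the point where
        // the value's ownership ends; the body decides what happens there.
        self.log.borrow_mut().push(self.name);
    }
}

fn main() {
    let log = Rc::new(RefCell::new(Vec::new()));
    {
        let _a = Noisy { name: "a", log: Rc::clone(&log) };
        {
            let _b = Noisy { name: "b", log: Rc::clone(&log) };
        } // `_b` dropped here, at the end of its scope
        assert_eq!(*log.borrow(), vec!["b"]);
    } // `_a` dropped here
    assert_eq!(*log.borrow(), vec!["b", "a"]);
}
```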
It seems like Box.clone() copies the heap memory. As I know, Box will get destructed after it gets out of its scope, as well as the memory area it is pointing to.
So I'd like to ask a way to create more than one Box object pointing to the same memory area.
By definition, you shall not.
Box is explicitly created with the assumption that it is the sole owner of the object inside.
When multiple owners are required, you can use Rc and Arc instead; these are reference-counted owners, and the object will only be dropped when the last owner is destroyed.
Note, however, that they are not without downsides:
the contained object cannot be mutated without runtime checks; if mutation is needed this requires using Cell, RefCell or some Mutex for example,
it is possible to accidentally form cycles of objects, and since Rust has no Garbage Collector such cycles will be leaked.
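A short sketch of the shared-ownership pattern, with RefCell providing the runtime-checked mutation mentioned above:

```rust
use std::cell::RefCell;
use std::rc::Rc;

fn main() {
    // Two handles to the same heap value; the allocation is freed only
    // when the last Rc is dropped.
    let first = Rc::new(RefCell::new(vec![1, 2]));
    let second = Rc::clone(&first);
    assert_eq!(Rc::strong_count(&first), 2);

    // Mutation goes through RefCell's runtime borrow check.
    second.borrow_mut().push(3);
    assert_eq!(*first.borrow(), vec![1, 2, 3]);

    drop(second);
    // One owner left; dropping `first` at the end of main frees the Vec.
    assert_eq!(Rc::strong_count(&first), 1);
}
```

To avoid the leaked cycles mentioned in the last bullet, back-references are usually held as Weak rather than Rc.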
I've been playing around with glib, which
utilizes reference counting to manage memory for its objects;
supports multiple threads.
What I can't understand is how they play together.
Namely:
In glib each thread doesn't seem to increase refcount of objects passed on its input, AFAIK (I'll call them thread-shared objects). Is it true? (or I've just failed to find the right piece of code?) Is it a common practice not to increase refcounts to thread-shared objects for each thread, that shares them, besides the main thread (responsible for refcounting them)?
Still, each thread increases reference counts for the objects it dynamically creates itself. Should the programmer take care not to give the same names to variables in each thread, in order to prevent name collisions and memory leaks? (E.g. in my picture, thread2 shouldn't create a heap variable called output_object or it will collide with thread1's heap variable of the same name)?
UPDATE: The answer to question 2 is no, because the visibility scopes of those variables don't intersect:
Is dynamically allocated memory (heap), local to a function or can all functions in a thread have access to it even without passing pointer as an argument.
An illustration to my questions:
I think that threads are irrelevant to understanding the use of reference counters. The point is rather ownership and lifetime, and a thread is just one thing that is affected by this. This is a bit difficult to explain, hopefully I'll make this clearer using examples.
Now, let's look at the given example where main() creates an object and starts two threads using that object. The question is, who owns the created object? The simple answer is that main() and both threads share this object, so this is shared ownership. In order to model this, you should increment the refcounter before each call to pthread_create(). If the call fails, you must decrement it again, otherwise it is the responsibility of the started thread to do that when it is done with the object. Then, when main() terminates, it should also release ownership, i.e. decrement the refcounter. The general rule is that when adding an owner, increment the refcounter. When an owner is done with the object, it decrements the refcounter and the last one destroys the object with that.
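The question is about glib and pthreads, but as a rough Rust analogue the same discipline falls out of Arc automatically: each owner holds its own clone, and the value is destroyed when the last clone is dropped (a sketch, not glib code):

```rust
use std::sync::Arc;
use std::thread;

fn main() {
    let shared = Arc::new(vec![1, 2, 3]);

    // "Increment the refcount before handing the object to each thread":
    // each clone is one owner's reference, released when the thread is done.
    let handles: Vec<_> = (0..2)
        .map(|_| {
            let own = Arc::clone(&shared);
            thread::spawn(move || own.iter().sum::<i32>())
        })
        .collect();

    for h in handles {
        assert_eq!(h.join().unwrap(), 6);
    }
    // main's own reference is dropped at the end of scope; whichever
    // owner finishes last frees the Vec.
}
```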
Now, why does the code not do this? Firstly, you can get away with adding the first thread as owner and then passing main()'s ownership to the second thread. This will save one increment/decrement operation. This still isn't what's happening though. Instead, no reference counting is done at all, and the simple reason is that it isn't needed. The point of refcounting is to coordinate the lifetime of a dynamically allocated object between different owners that are peers. Here though, the object is created and owned by main(), and the two threads are not peers but rather slaves of main. Since main() is the master that controls start/stop of the threads, it doesn't have to coordinate the lifetime of the object with them.
Lastly, though that might be due to the example-ness of your code, I think that main simply leaks the reference, relying on the OS to clean up. While this isn't beautiful, it doesn't hurt. In general, you can allocate objects once and then use them forever without any refcounting in some cases. An example for this is the main window of an application, which you only need once and for the whole runtime. You shouldn't repeatedly allocate such objects though, because then you have a significant memory leak that will increase over time. Both cases will be caught by tools like valgrind though.
Concerning your second question, about the heap variable name clash you expect: it doesn't exist. Variable names that are function-local cannot collide. This is not because they are used by different threads; even if the same function is called twice by the same thread (think recursion!), the local variables in each call to the function are distinct. Also, variable names are for the human reader; the compiler completely eradicates them.
UPDATE:
As matthias says below, GObject is not thread-safe, only reference counting functions are.
Original content:
GObject is supposed to be thread safe, but I've never played with that myself…
If I have a garbage collector that tracks every object allocated and deallocates them as soon as they no longer have usable references to them, can you still have a memory leak?
Considering a memory leak is allocations without any reference isn't that impossible or am I missing something?
Edit: So what I'm counting as a memory leak is allocations which you no longer have any reference to in the code. Large numbers of accumulating allocations which you still have references to aren't the leaks I'm considering here.
I'm also only talking about normal state of the art G.C., It's been a while but I know cases like cyclical references don't trip them up. I don't need a specific answer for any language, this is just coming from a conversation I was having with a friend. We were talking about Actionscript and Java but I don't care for answers specific to those.
Edit2: From the sounds of it, there doesn't seem to be any reason code can completely lose the ability to reference an allocation and not have a GC be able to pick it up, but I'm still waiting for more to weigh in.
If your question is really this:
Considering a memory leak is allocations without any reference isn't that impossible or am I missing something?
Then the answer is "yes, that's impossible" because a properly implemented garbage collector will reclaim all allocations that don't have active references.
However, you can definitely have a "memory leak" in (for example) Java. My definition of a "memory leak" is an allocation that still has an active reference (so that it won't be reclaimed by the garbage collector) but the programmer doesn't know that the object isn't reclaimable (ie: for the programmer, this object is dead and should be reclaimed). A simple example is something like this:
ObjectA -> ObjectB
In this example, ObjectA is an object in active use in the code. However, ObjectA contains a reference to ObjectB that is effectively dead (ie: ObjectB has been allocated and used and is now, from the programmer's perspective, dead) but the programmer forgot to set the reference in ObjectA to null. In this case, ObjectB has been "leaked".
Doesn't sound like a big problem, but there are situations where these leaks are cumulative. Let's imagine that ObjectA and ObjectB are actually instances of the same class. And this problem that the programmer forgot to set the reference to null happens every time such an instance is used. Eventually you end up with something like this:
ObjectA -> ObjectB -> ObjectC -> ObjectD -> ObjectE -> ObjectF -> ObjectG -> ObjectH -> etc...
Now ObjectB through ObjectH are all leaked. And problems like this will (eventually) cause your program to crash. Even with a properly implemented garbage collector.
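The same "reachable but dead" pattern can be reproduced in any language; here is a sketch (the Session and Registry types are hypothetical) where a long-lived container keeps growing because entries are never removed:

```rust
// Sketch of a "reachable but dead" leak: the registry keeps every
// session alive even after the program is done with it.
struct Session {
    _payload: Vec<u8>,
}

struct Registry {
    // Never pruned: each entry stays reachable forever, so no collector
    // (and no Drop) will ever reclaim it.
    sessions: Vec<Session>,
}

impl Registry {
    fn open(&mut self) {
        self.sessions.push(Session { _payload: vec![0; 1024] });
        // Forgetting a matching `remove` here is the leak: memory grows
        // without bound even though nothing is unreachable.
    }
}

fn main() {
    let mut reg = Registry { sessions: Vec::new() };
    for _ in 0..1000 {
        reg.open();
    }
    assert_eq!(reg.sessions.len(), 1000);
}
```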
To decide whether a program has a memory leak, one must first define what a leak is. I would define a program as having a memory leak if there exists some state S and series of inputs I such that:
If the program is in state `S` and it receives inputs `I`, it will still be in state `S` (if it doesn't crash), but...
The amount of memory required to repeat the above sequence `N` times will increase without bound.
It is definitely possible for programs that run entirely within garbage-collected frameworks to have memory leaks as defined above. A common way in which that can occur is with event subscriptions.
Suppose a thread-safe collection exposes a CollectionModified event, and the IEnumerator<T> returned by its IEnumerable<T>.GetEnumerator() method subscribes to that event on creation and unsubscribes on Dispose; the event is used to allow enumeration to proceed sensibly even when the collection is modified (e.g. ensuring that objects that are in the collection continuously throughout the enumeration are returned exactly once, and those that exist during only part of it are returned at most once). Now suppose a long-lived instance of that collection class is created, and some particular input causes it to be enumerated. If the CollectionModified event holds a strong reference to every non-disposed IEnumerator<T>, then repeatedly enumerating the collection will create and subscribe an unbounded number of enumerator objects. Memory leak.
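A rough Rust sketch of the same shape (all names are hypothetical; Rust has no GC, but the growth pattern is identical): every "enumeration" subscribes a callback that is never unsubscribed, so the long-lived collection pins an unbounded number of subscriber objects:

```rust
struct ObservableList {
    items: Vec<i32>,
    // Each subscriber is held by a strong reference; without an
    // unsubscribe step, this list only ever grows.
    modified_listeners: Vec<Box<dyn Fn(&[i32])>>,
}

impl ObservableList {
    fn enumerate(&mut self) -> i32 {
        // Simulates an enumerator that subscribes on creation but
        // (buggily) never unsubscribes when it is done.
        self.modified_listeners.push(Box::new(|_| {}));
        self.items.iter().sum()
    }
}

fn main() {
    let mut list = ObservableList {
        items: vec![1, 2, 3],
        modified_listeners: Vec::new(),
    };
    for _ in 0..100 {
        assert_eq!(list.enumerate(), 6);
    }
    // One listener leaked per enumeration: the unbounded growth from the text.
    assert_eq!(list.modified_listeners.len(), 100);
}
```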
Memory leaks don't just depend on how efficient the garbage collection algorithm is: if your program holds on to object references with a long lifetime, say in an instance variable or a static variable, without ever using them, your program will have memory leaks.
Reference counting has a known problem with cyclic references: Object 1 refers to Object 2 and Object 2 refers to Object 1, but nothing else refers to either of them. A reference-counting algorithm will fail to reclaim them in this scenario.
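Rust's Rc makes this concrete: a two-node cycle keeps both strong counts above zero, so neither destructor ever runs (Weak references are the usual fix for such back-references):

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Node {
    peer: RefCell<Option<Rc<Node>>>,
}

fn main() {
    let a = Rc::new(Node { peer: RefCell::new(None) });
    let b = Rc::new(Node { peer: RefCell::new(None) });

    // Object 1 refers to Object 2, and Object 2 refers to Object 1.
    *a.peer.borrow_mut() = Some(Rc::clone(&b));
    *b.peer.borrow_mut() = Some(Rc::clone(&a));

    assert_eq!(Rc::strong_count(&a), 2);
    assert_eq!(Rc::strong_count(&b), 2);
    // Dropping `a` and `b` at the end of main leaves each count at 1:
    // the cycle keeps both allocations alive -- a leak under pure
    // reference counting.
}
```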
Since you are working on the garbage collector itself, it's worth reading about the different implementation strategies.
You can have memory leaks with a GC in another way: if you use a conservative garbage collector that naively scans memory and, for anything that looks like a pointer, declines to free the memory it "points to", you may leave unreachable memory allocated.