Garbage collection / linked list - garbage-collection

Will the garbage collector (in theory) collect a structure like this?
    package main
    type node struct {
        next *node
        prev *node
    }
    func (a *node) append(b *node) {
        a.next = b
        b.prev = a
    }
    func main() {
        a := new(node)
        b := new(node)
        a.append(b)
        b = nil
        a = nil
    }
This should be a linked list: a points to b, and b points back to a. When I remove the references held in a and b (the last two lines), the two nodes are not accessible any more, but each node is still referenced by the other. Will the Go garbage collector remove these nodes nonetheless?
(Obviously not in the code above, but in a longer running program).
Is there any documentation on the garbage collector that handles these questions?

The set of garbage collector (GC) roots in your program is {a, b}. Setting all of them to nil makes all heap content eligible for collection, because now all of the existing nodes, even though they are referenced, are not reachable from any root.
The same principle also guarantees that, for example, structures with circular and/or self references get collected once they become unreachable.

The concern you describe is actually a real problem with a simple garbage collection scheme known as "reference counting." Essentially, exactly as you imply, the garbage collector (GC) counts how many references exist to a given object, and when that number reaches 0, the object is GC'd. And, indeed, circular references will prevent a plain reference counting system from GC-ing that structure, because the counts never drop to 0.
Instead, what many modern GCs do (including Go's; see this post) is a process known as mark-and-sweep. Essentially, all top-level references (global variables and pointers that you have in the scope of some function) are marked as "reachable," and then all things referenced from those references are marked as reachable, and so on, until all reachable objects have been marked. Then, anything which hasn't been marked is known to be unreachable, and is GC'd. Circular references aren't a problem because, if they aren't referenced from the top level, they won't get marked.
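To make the marking idea concrete, here is a toy model in Go. It is only a sketch of the principle, not how the Go runtime actually implements its collector; the names object, mark and unreachable are invented for this example.

```go
package main

import "fmt"

// object is a hypothetical heap object; refs are its outgoing pointers.
type object struct {
	name   string
	refs   []*object
	marked bool
}

// mark flags every object reachable from o. The marked check also
// keeps the walk from looping forever on cycles.
func mark(o *object) {
	if o == nil || o.marked {
		return
	}
	o.marked = true
	for _, r := range o.refs {
		mark(r)
	}
}

// unreachable runs the mark phase from roots and returns the names of
// the objects in heap that were never marked. A sweep phase would free
// exactly these objects.
func unreachable(heap, roots []*object) []string {
	for _, o := range heap {
		o.marked = false
	}
	for _, r := range roots {
		mark(r)
	}
	var dead []string
	for _, o := range heap {
		if !o.marked {
			dead = append(dead, o.name)
		}
	}
	return dead
}

func main() {
	a := &object{name: "a"}
	b := &object{name: "b"}
	a.refs = []*object{b} // like a.next = b
	b.refs = []*object{a} // like b.prev = a
	heap := []*object{a, b}

	fmt.Println(unreachable(heap, []*object{a})) // one root left: prints []
	fmt.Println(unreachable(heap, nil))          // no roots left: prints [a b]
}
```

With no roots left, both nodes end up unmarked even though they still point at each other, which is why the cycle does not keep them alive.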

Related

What happens to the memory that was 'moved into'?

Does rust drop the memory after move is used to reassign a value?
What happens to the string "aaa" in this example?
let mut s = String::from("aaa");
s = String::from("bbb");
I'm guessing that the "aaa" String is dropped - it makes sense, as it is not used anymore. However, I cannot find anything confirming this in the docs. (for example, the Book only provides an explanation of what happens when we assign a new value using move).
I'm trying to wrap my head around the rules Rust uses to ensure memory safety, but I cannot find an explicit rule of what happens in such a situation.
Yes, assignment to a variable will drop the value that is being replaced.
Not dropping the replaced value would be a disaster - since Drop is frequently used for deallocation and other cleanup, if the value wasn't dropped, you'd end up with all sorts of leaks.
Move semantics are implicit here. The data in s is initialized by moving from the String produced by String::from("bbb"). The original data stored in s is dropped by side-effect (the process of replacing it leaves it with no owner, so it's dropped as part of the operation).
Per the destructor documentation (emphasis added):
When an initialized variable or temporary goes out of scope, its destructor is run, or it is dropped. Assignment also runs the destructor of its left-hand operand, if it's initialized. If a variable has been partially initialized, only its initialized fields are dropped.

Point a pointer to local variable created within a function

Here is the code:
    var timePointer *[]time.Time
    func UpdateHolidayList() error {
        // updating logic: pulling list of holidays from API
        holidaySlice := make([]time.Time, 0)
        // appending some holidays of type time.Time to holidaySlice
        // now holidaySlice contains a few time.Time values
        timePointer = &holidaySlice
        return nil
    }
    func main() {
        // run UpdateHolidayList every 7 days
        go func() {
            for {
                UpdateHolidayList()
                time.Sleep(7 * 24 * time.Hour)
            }
        }()
    }
I have 4 questions:
1. holidaySlice is a local variable; is it safe to point a (global) pointer to it?
2. Is this whole code multi-thread safe?
3. After pointing timePointer to holidaySlice, can I access the values via timePointer?
4. (If the answer to 3 is "yes") The holiday list is constantly changing, so holidaySlice will be different on each update. Will the values accessed via timePointer change accordingly then?
holidaySlice is a local variable allocated on the heap. Any variable pointing to the same heap location can access the data structure at that location. Whether or not it is safe depends on how you access it. Even if holidaySlice were not explicitly allocated on the heap, once you make a global variable point to it, the Go compiler would detect that it "escapes" and allocate it on the heap.
The code is not thread-safe. You are modifying a shared variable (the global variable) without any explicit synchronization, so there is no guarantee on when or if other goroutines will see the updates to that variable.
If you update the contents of the slice that timePointer points to without explicit synchronization, there is no guarantee on when or if other goroutines will see those updates. You have to use synchronization primitives like sync.Mutex to delimit read/write access to data structures that can be updated/read by multiple goroutines.
1. Yes. As long as your main does not end, timePointer keeps a valid pointer to holidaySlice: the compiler detects that the variable escapes and allocates it on the heap, so the memory is not freed.
2. Nope, absolutely not. Look at the sync package.
3. Yes, you can. Just remember to iterate over *timePointer instead of timePointer.
4. It will change, but not accordingly. Since you have not done any synchronization, you have no defined way of knowing what data is stored in the slice pointed to by timePointer when it is read.

Compare and swap with and without garbage collector

How does CAS work? How does it work with a garbage collector? Where is the problem, and how does it work without a garbage collector?
I was reading a presentation about CAS and its use for the "write rarely, read many" problem. It said that CAS is convenient when you can use a garbage collector, but that there is a problem (not specified) when you cannot.
Can you tell me something about this? If you could first sum up the principle of CAS, it would be appreciated.
Ok, so CAS is an atomic instruction, that is, there is special hardware support for it.
Its main use is to avoid locks altogether when implementing your data structures and other operations. With locks, if a thread holding a lock takes a page fault or a cache miss, or is descheduled by the OS, for instance, the thread takes the lock with it and all the other threads waiting for that lock are blocked. This obviously causes serious performance issues.
CAS is the core of lock-free programming (see also here and here).
CAS basically is the following:
    CAS(CURRENT_VALUE, OLD_VALUE, NEW_VALUE) <=>
        if CURRENT_VALUE==OLD_VALUE then CURRENT_VALUE = NEW_VALUE
You have a variable (e.g. a class variable) and you have no clue whether it was modified by other threads between the time you read it and the time you want to write to it.
CAS helps you on the write part, since the CAS is done atomically (in hardware) and no lock is involved. Thus, even if your thread goes to sleep, the rest of the threads can still operate on your data structure.
The issue with CAS on non-GC systems is the ABA problem. An example is the following:
You have a singly linked list: HEAD->A->X->Y->Z
Thread 1: let's read A: localA = A; localA_Value = A.Value (let's say 5)
Thread 2: let's delete A: HEAD->X->Y->Z
Thread 3: let's add a new node at the start (malloc happens to reuse the exact spot where the old A was): HEAD->A'->X->Y->Z (A'.Value = 10)
Thread 1 resumes and wants to swap A with B: CAS(HEAD, localA, B). Thread 1 expects that, if the CAS passes, the value of the head node is still 5. That is wrong: the CAS passes because localA and A' have the same memory location, yet localA_Value != A'.Value, so the operation shouldn't have been performed.
The thing is that in GC-enabled systems this will never happen, since localA holds a reference to that memory location and thus A' will never get allocated at that memory location.
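For comparison, here is a minimal lock-free stack in Go, a garbage-collected language. It is a sketch rather than production code: the stack and node types are invented for this example, and it assumes Go 1.19 or newer for atomic.Pointer.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// node is one element of the stack. next is set before the node is
// published and never changed afterwards.
type node struct {
	value int
	next  *node
}

// stack is a minimal lock-free (Treiber-style) stack.
type stack struct {
	head atomic.Pointer[node]
}

// Push allocates a fresh node and installs it as the new head.
func (s *stack) Push(v int) {
	n := &node{value: v}
	for {
		old := s.head.Load()
		n.next = old
		if s.head.CompareAndSwap(old, n) {
			return
		}
	}
}

// Pop removes the head node and returns its value, or false if empty.
func (s *stack) Pop() (int, bool) {
	for {
		old := s.head.Load()
		if old == nil {
			return 0, false
		}
		// While this goroutine holds old, the GC cannot hand its
		// memory to a newly allocated node, so a successful CAS
		// means head is still the very node that was read above.
		if s.head.CompareAndSwap(old, old.next) {
			return old.value, true
		}
	}
}

func main() {
	var s stack
	s.Push(1)
	s.Push(2)
	v, ok := s.Pop()
	fmt.Println(v, ok) // prints 2 true
}
```

In a language with manual memory management the same Pop would need extra machinery (tagged pointers, hazard pointers or similar) to rule out the address reuse described above.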

Are objects accessed indirectly in D?

As I've read, all objects in D are fully location independent. How is this requirement achieved?
One thing that comes to my mind is that all references are not pointers to the objects, but to some proxy, so when you move an object (in memory) you just update that proxy, not all references used in the program.
But this is just my guess. How is it done in D for real?
edit: bottom line up front, no proxy object, objects are referenced directly through regular pointers. /edit
structs aren't allowed to keep a pointer to themselves, so if they get copied, they should continue to just work. This isn't strictly enforced by the language though:
    struct S {
        S* lol;
        void beBad() {
            lol = &this; // the compiler will allow this....
        }
    }
    S pain() {
        S s;
        s.beBad();
        return s;
    }
    void main() {
        S s;
        s = pain();
        assert(s.lol !is &s); // but it will also move the object without notice!
    }
(EDIT: actually, I guess you could use a postblit to update internal pointers, so it isn't quite without notice. If you're careful enough, you could make it work, but then again, if you're careful enough, you can shoot between your toes without hitting your foot too. EDIT2: Actually no, the compiler/runtime is still allowed to move it without even calling the postblit. One example of where this happens is if it copies a stack frame to the heap to make a closure. The struct data is moved to a new address without being informed. So yeah. /edit)
And actually, that assert isn't guaranteed to pass, the compiler might choose to call pain straight on the local object declared in main, so the pointer would work (though I'm not able to force this optimization here for a demo, generally, when you return a struct from a function, it is actually done via a hidden pointer the caller passes - the caller says "put the return value right here" thus avoiding a copy/move in some cases).
But anyway, the point just is that the compiler is free to copy or not to copy a struct at its leisure, so if you do keep the address of this around in it, it may become invalid without notice; keeping that pointer is not a compile error, but it is undefined behavior.
The situation is different with classes. Classes are allowed to keep references to this internally, since a class is (in theory, realized by the garbage collector implementation) an independent object with an infinite lifetime. While it may be moved (such as by a moving GC, which is not implemented in D today), if it is moved, all references to it, internal and external, would also be required to be updated.
So classes can't have the memory pulled out from under them like structs can (unless you the programmer take matters into your own hands and bypass the GC...)
The location independent thing I'm pretty sure is referring only to structs and only to the rule that they can't have pointers to themselves. There's no magic done with references or pointers - they indeed work with memory addresses, no proxy objects.

Relation between type safety and practicality of garbage collection

What are the features of the C programming language that break type safety and prohibit practical garbage collection from being added to the language? Explain.
Firstly, I don't understand the relationship between type-safety and garbage collection. I'd appreciate if someone can help me with that.
You can do garbage collection in C. It is called conservative garbage collection. The trick is to treat any data that looks like a pointer as if it were in fact a pointer, and not reclaim any memory that is reachable through it. There are two problems: first, you cannot move data around (i.e., compaction), because of the uncertainty of whether something that looks like a pointer is in fact a pointer (so updating it to point to a new location could result in data corruption).
The second, type-safety problem is that it is possible for a C programmer to store a pointer in an integer variable, perform math on it, and then restore the pointer (as in: ptrdiff_t d = (ptrdiff_t) ptr; ptr = NULL; d += 42; /* GC here would be bad */ d -= 42;). This pointer hiding could lead a conservative garbage collector to prematurely reclaim memory that was only reachable through that pointer.
There is no relation between type safety and garbage collection whatsoever. For example, there is a language called Ada (not very popular these days, though) which is very type safe but doesn't feature a garbage collector. At the same time, JavaScript is a dynamically typed language (i.e. no static type checking at all) but has a garbage collector in most implementations.
