When does a custom class loader become a GC root? - garbage-collection

Quote from https://www.yourkit.com/docs/java/help/gc_roots.jsp:
There are several kinds of GC roots. One object can belong to more than one kind of root. The root kinds are:
(...)
Held by JVM - objects held from garbage collection by JVM for its purposes. Actually the list of such objects depends on JVM implementation. Possible known cases are: the system class loader, a few important exception classes which the JVM knows about, a few pre-allocated objects for exception handling, and custom class loaders when they are in the process of loading classes.
Under what circumstances does my own class loader become a GC root? How can I stop the "process of loading classes"? What can I do if the JVM (HotSpot) prevents my class loader from being garbage collected even though it has no path to any other GC root?

I don’t think that you should worry about this. When ordinary Java code is in the process of loading classes through your custom class loader, it would be natural that the class loader can’t get garbage collected while the process is ongoing. But in that case, a Java stack frame of the thread using the ClassLoader would be reported as the garbage collection root (at least, if no other roots exist).
Now, when the JVM does the same, e.g. when resolving dependencies, there isn’t necessarily a Java stack frame containing a variable pointing to the class loader, hence, it may report the class loader as being in use by the JVM without reporting one of the other categories, so “Held by JVM” is the category then.
Still, a JVM won’t be within the process of loading a class without a reason. Resolving a class through a particular class loader implies that the class loader would be reachable anyway, directly or indirectly, from the class whose symbols are to be resolved. Being reported as being referenced by a GC root doesn’t preclude being reachable through another chain of references as well. There is a theoretical possibility to encounter a class loader in the process of loading without other chains of references, though, if the reachability changes during the process. But that would be only an issue of the timing of the snapshot, not an indicator for a problem.
But keep in mind that there might be other reasons for getting “Held by JVM”, e.g. a JVM doesn’t need to support class unloading at all (for HotSpot, configurable via options). If class unloading is not supported, every class loader can be considered a gc root, whether being reported as such or not. But there is nothing you can do about this in that case.
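A minimal Java sketch of how one might check whether a custom class loader is still being held, assuming a throwaway URLClassLoader with no URLs and no loaded classes. The class and method names are hypothetical. The first method shows the deterministic half (a strongly referenced loader is never cleared); the second only reports what happened on one run, because System.gc() is merely a hint and class unloading may be disabled.

```java
import java.lang.ref.WeakReference;
import java.net.URL;
import java.net.URLClassLoader;

class LoaderReachability {

    /** While a strong reference exists, the weak reference cannot be cleared. */
    static boolean survivesWhileHeld() {
        ClassLoader loader = new URLClassLoader(new URL[0], null);
        WeakReference<ClassLoader> ref = new WeakReference<>(loader);
        System.gc();
        // Comparing against `loader` keeps it strongly reachable past the gc call.
        return ref.get() == loader;
    }

    /** Creates a loader whose only remaining reference is the weak one. */
    static WeakReference<ClassLoader> newUnreferencedLoader() {
        return new WeakReference<>(new URLClassLoader(new URL[0], null));
    }

    /** Reports whether the loader was collected on this run; not guaranteed. */
    static boolean collectedAfterRelease() throws InterruptedException {
        WeakReference<ClassLoader> ref = newUnreferencedLoader();
        for (int i = 0; i < 10 && ref.get() != null; i++) {
            System.gc();        // a request, not a command
            Thread.sleep(50);
        }
        return ref.get() == null;
    }
}
```

If collectedAfterRelease() keeps returning false for a real loader, a heap dump showing the shortest path to a GC root is usually more informative than the root category alone.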

Related

Monitor Global Execution Context

Let's assume I have a large system that uses the Scala global ExecutionContext in many places to run different Futures. One day the system developed a performance problem (memory dump analysis showed that the cause was too many Futures in the GEC). I found the cause quickly because I was lucky, but it made me start thinking about ways to monitor the state of execution contexts.
It's not easy, and I can't find articles on the topic. I hacked the GEC so that it now exposes a few parameters via JMX (QueuedSubmissionCount, QueuedCount, etc.), but that is still not enough to tell which class caused the leak.
I thought about a custom wrapper for the ExecutionContext that takes an additional identifier (a class name or something similar) and counts the number of futures per identifier. But I believe many programmers have the same problem, so maybe you have a better idea?
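The wrapper idea could look roughly like the following Java sketch. It is written against java.util.concurrent.Executor rather than the Scala API, and TaggedExecutor, forTag and pending are hypothetical names. It counts tasks that have been submitted but have not yet finished, per identifier.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicLong;

class TaggedExecutor {
    private final Executor delegate;
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

    TaggedExecutor(Executor delegate) {
        this.delegate = delegate;
    }

    /** Submits a task and counts it under `tag` until it has finished. */
    void execute(String tag, Runnable task) {
        AtomicLong counter = counters.computeIfAbsent(tag, t -> new AtomicLong());
        counter.incrementAndGet();
        try {
            delegate.execute(() -> {
                try {
                    task.run();
                } finally {
                    counter.decrementAndGet();
                }
            });
        } catch (RuntimeException rejected) {
            counter.decrementAndGet();   // the task never entered the queue
            throw rejected;
        }
    }

    /** A plain Executor view that attributes everything to one tag. */
    Executor forTag(String tag) {
        return task -> execute(tag, task);
    }

    /** Number of queued or running tasks for `tag`. */
    long pending(String tag) {
        AtomicLong counter = counters.get(tag);
        return counter == null ? 0 : counter.get();
    }
}
```

On the Scala side, the Executor returned by forTag could presumably be turned into an ExecutionContext with ExecutionContext.fromExecutor, and the per-tag counts exposed over JMX next to the existing queue metrics.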

Does garbage collection lead to bad software design?

I have heard that garbage collection leads to bad software design. Is that true? It is true that we don't care about the lifetime of objects in garbage collected languages, but does that have an effect in program design?
If an object asks other objects to do something on its behalf until further notice, the object's owner should notify it when its services are not required. Having code abandon such objects without notifying them that their services won't be required anymore would be bad design, but that would be just as true in a garbage-collected framework as in a non-GC framework.
Garbage-collected frameworks, used properly, offer two advantages:
In many cases, objects are created for the purpose of encapsulating values therein, and references to the objects are passed around as proxies for that data. Code receiving such references shouldn't care about whether other references exist to those same objects or whether it holds the last surviving references. As long as someone holds a reference to an object, the data should be kept. Once nobody needs the data anymore, it should cease to exist, but nobody should particularly notice.
In non-GC frameworks, an attempt to use a disposed object will usually generate Undefined Behavior that cannot be reliably trapped (and may allow code to violate security policies). In many GC frameworks, it's possible to ensure that attempts to use disposed resources will be trapped deterministically and cannot undermine security.
In some cases, garbage collection will allow a programmer to "get away with" designs that are sloppier than would be tolerable in a non-GC system. A GC-based framework will also, however, allow the use of many good programming patterns which could not be implemented as efficiently in a non-GC system. For example, if a program uses multiple worker threads to find the optimal solution for a problem, and has a UI thread which periodically wants to show the best solution found so far, the UI thread would want to know that when it asks for a status update it will get a solution that has been found, but won't want to burden the worker threads with the synchronization necessary to ensure that it has the absolute-latest solution.
In a non-GC system, thread synchronization would be unavoidable, since the UI thread and worker thread would have to coordinate who was going to delete a status object that becomes obsolete while it's being shown. In a GC-based system, however, the GC would be able to tell whether a UI thread was able to grab a reference to a status object before it got replaced, and thus resolve whether the object needed to be kept alive long enough for the UI thread to display it. The GC would sometimes have to force thread synchronization to find all reachable references, but occasional synchronization for the GC may pose less of a performance drain than the frequent thread synchronization required in a non-GC system.
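The worker/UI pattern described above might be sketched in Java as follows. Solution and BestSoFar are hypothetical names, and the sketch assumes that a lower cost means a better solution. Workers publish immutable snapshots through an AtomicReference; the update is a lock-free compare-and-set rather than a lock, and nobody has to decide when a replaced snapshot is freed.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Immutable status object; safe to read from any thread once published. */
final class Solution {
    final double cost;
    final String description;

    Solution(double cost, String description) {
        this.cost = cost;
        this.description = description;
    }
}

class BestSoFar {
    private final AtomicReference<Solution> best = new AtomicReference<>();

    /** Called by worker threads: replaces the snapshot only if strictly better. */
    void offer(Solution candidate) {
        best.accumulateAndGet(candidate,
                (current, c) -> current == null || c.cost < current.cost ? c : current);
    }

    /** Called by the UI thread: returns some solution that was actually found. */
    Solution snapshot() {
        return best.get();
    }
}
```

A snapshot the UI thread is still displaying stays valid even after a worker has replaced it; the collector reclaims it once the UI thread drops its reference.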

How do garbage collectors know about references on the stack frame?

What techniques do modern garbage collectors (as in CLR, JVM) use to tell which heap objects are referenced from the stack?
Specifically how can a VM work back from knowing where the stack starts to interpreting all local references to heap objects?
In Java (and likely in the CLR, although I know its internals less well), the bytecode is typed with object vs. primitive information. As a result, the VM can build maps that describe which slots in each stack frame hold object references and which hold primitives (the StackMapTable attribute in class files records similar type information for the verifier; HotSpot calls its GC-side maps OopMaps). When the GC needs to scan the root set, it uses these maps to differentiate between references and non-references.
CLR and Java have to have some mechanism like this because they are exact collectors. There are conservative collectors, like the Boehm collector, that treat every offset on the stack as a possible pointer. They look to see if the value (when treated as a pointer) is an offset into the heap, and if so, they mark it as alive.
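To make the difference concrete, here is a toy Java simulation. It is not how any real VM is implemented: the stack is modelled as an array of words, the heap as a made-up address range, and StackScan and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class StackScan {
    // Made-up heap bounds for the simulation.
    static final long HEAP_START = 0x1000;
    static final long HEAP_END = 0x2000;

    /** Conservative: any word that looks like a heap address is treated as a root. */
    static List<Long> conservativeRoots(long[] stackWords) {
        List<Long> roots = new ArrayList<>();
        for (long word : stackWords) {
            if (word >= HEAP_START && word < HEAP_END) {
                roots.add(word);
            }
        }
        return roots;
    }

    /** Exact: a per-frame map says which slots hold references. */
    static List<Long> exactRoots(long[] stackWords, boolean[] isReference) {
        List<Long> roots = new ArrayList<>();
        for (int i = 0; i < stackWords.length; i++) {
            if (isReference[i] && stackWords[i] != 0) {   // 0 models null
                roots.add(stackWords[i]);
            }
        }
        return roots;
    }
}
```

With the stack words {42, 0x1010, 0x1800, 0} and a map marking only slots 1 and 3 as references, the exact scan reports 0x1010 alone, while the conservative scan also reports 0x1800, an integer that merely happens to fall inside the heap range and would keep an otherwise dead object alive.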
Take a look at this Artima article from August 1996, Java's Garbage-Collected Heap; especially page 2.
Any garbage collection algorithm must do two basic things. First, it must detect garbage objects. Second, it must reclaim the heap space used by the garbage objects and make it available to the program. Garbage detection is ordinarily accomplished by defining a set of roots and determining reachability from the roots. An object is reachable if there is some path of references from the roots by which the executing program can access the object. The roots are always accessible to the program. Any objects that are reachable from the roots are considered live. Objects that are not reachable are considered garbage, because they can no longer affect the future course of program execution.
In a JVM the root set is implementation dependent but would always include any object references in the local variables. In the JVM, all objects reside on the heap. The local variables reside on the Java stack, and each thread of execution has its own stack. Each local variable is either an object reference or a primitive type, such as int, char, or float. Therefore the roots of any JVM garbage-collected heap will include every object reference on every thread's stack. Another source of roots are any object references, such as strings, in the constant pool of loaded classes. The constant pool of a loaded class may refer to strings stored on the heap, such as the class name, superclass name, superinterface names, field names, field signatures, method names, and method signatures.
Any object referred to by a root is reachable and is therefore a live object. Additionally, any objects referred to by a live object are also reachable. The program is able to access any reachable objects, so these objects must remain on the heap. Any objects that are not reachable can be garbage collected because there is no way for the program to access them.
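The reachability rule quoted above amounts to a graph traversal starting at the roots. A minimal Java sketch, with objects modelled as string names and references as an adjacency map (Reachability and its method are hypothetical names, not part of any JVM):

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class Reachability {

    /** Returns every object reachable from the roots; everything else is garbage. */
    static Set<String> reachable(Collection<String> roots, Map<String, List<String>> refs) {
        Set<String> live = new HashSet<>();
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String obj = work.pop();
            if (live.add(obj)) {   // first visit: follow its outgoing references
                work.addAll(refs.getOrDefault(obj, Collections.emptyList()));
            }
        }
        return live;
    }
}
```

Note that two objects referring only to each other are not live under this rule, which is one reason tracing collectors handle cycles that plain reference counting cannot.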
The article continues to explore different garbage collection strategies, including reference counting collectors, tracing collectors, compacting collectors and copying collectors.
Though this article is old, it still applies today; not much has really changed. There have been performance improvements to the different collection strategies, but no new major advancements.
The Oracle HotSpot JVM, for example, has a new Garbage-First Garbage Collector which is a copying collector with performance tweaks for multi-core processors and large heap sizes (see this answer for more on the G1 Garbage Collector).
Interesting documentation on this topic, posted by the .NET team shortly after they made CoreCLR open source: Stack Walking

How does MonoTouch garbage collect?

Are details of the MonoTouch garbage collection published anywhere? I am interested in knowing how it works on the iPhone. I'd like to know:
How often it runs, and are there any constraints that might stop it running.
Whether it is completely thread safe, so objects passed from one thread to another are handled properly, or if there are constraints we should be aware of.
If there is any benefit in manually calling the garbage collector before initiating an action that will use memory.
How does it handle low memory notifications, and running out of memory.
Such information would help us understand the stacks and thread information that we have from application logs.
[Edit] I've now found the information at Hans Boehm's site, but that is very generic and lists various options and choices the implementer has, including how threads are handled. Specific MonoTouch information is what I am wanting here.
The garbage collector is the same one used in Mono, the source code is here:
https://github.com/mono/mono/tree/master/libgc
It is completely thread safe, and multi-core safe, which means that multiple threads can allocate objects and it can garbage collect in the presence of multiple threads.
That being said, your question is a little bit tricky, because you are not really asking about the garbage collector when you say "so objects passed from one thread to another are handled properly, or if there are constraints we should be aware of".
That is not really a garbage collector question, but an API question, and the answer depends heavily on the API that you are calling. The rules are the same as for .NET: unless the API documentation explicitly states otherwise, instance methods are not thread safe and static methods are thread safe.
Now, UI APIs like UIKit or CoreGraphics are no different from any other GUI toolkit in the world. UI toolkits are not thread safe, so you cannot assume that a UILabel created on the main thread can safely be accessed from another thread. That is why you have to call "BeginInvokeOnMainThread" on an NSObject to ensure that any methods you call on UIKit objects are executed only on the main thread.
That is just one example.
Check http://monotouch.net/Documentation/Threading for more information
Low memory notifications are delivered by the operating system to your UIViewControllers, not to Mono's GC, so you need to take appropriate action in those cases.

Which Qt classes use the disk directly?

I'm trying to write a library to separate all the disk activity out into its own thread, but the documentation doesn't really care about such things.
What I want to accomplish is that aside from startup, all disk activity is asynchronous, and for that, I need to wrap every class that accesses the disk. Here's what I found so far:
QtCore:
QFile
QTemporaryFile
QDir
QFileInfo
QFileSystemWatcher
QDirIterator
QSettings
QtGui:
QFileDialog
QFileSystemModel
QDirModel
I'm sure there are more.
I have a couple of points -
First, when you do this, remember that all GUI objects, which are based on QWidget, have to run in the start-up thread. See http://doc.trolltech.com/4.6/threads-qobject.html which talks about threading. The quote is "Although QObject is reentrant, the GUI classes, notably QWidget and all its subclasses, are not reentrant. They can only be used from the main thread. As noted earlier, QCoreApplication::exec() must also be called from that thread".
This also means that if you need to display information from one of these wrapper classes on the screen, you need to be careful about ownership of objects when you pass information back to the GUI thread, particularly for anything that is based on QObject.
Second, starting threads carries a run-time cost. So I would suggest that you structure your design to minimize the number of times this wrapper thread class is created and destroyed.
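The "one long-lived worker instead of a thread per operation" structure is not specific to Qt. As an illustration only, here is a Java sketch of the same idea (this is not Qt code, and DiskWorker and readText are hypothetical names): a single I/O thread is created once and every disk request is queued to it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class DiskWorker implements AutoCloseable {
    // One thread for the lifetime of the worker; requests are queued to it.
    private final ExecutorService io = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "disk-io");
        t.setDaemon(true);
        return t;
    });

    /** Reads a file on the I/O thread; the caller gets a future and is not blocked. */
    CompletableFuture<String> readText(Path file) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, io);
    }

    @Override
    public void close() {
        io.shutdown();   // already queued requests still complete
    }
}
```

In a Qt design the corresponding piece would be a worker object living in one QThread, with results handed back to the GUI thread rather than touched from the worker.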
Overall an interesting approach to files. This is one that I'm going to consider for my current application. It may solve some problems I'm having.
