I just gone though memory leaks articles but not getting Why open connections or unclosed resources can not be garbage collected in jvm? Can someone please help with example. Thanks in advance.
The Java garbage collector isn't required to ever reclaim an object, even when it's unreachable. In practice, it's usually reclaimed eventually, but this can take hours (or days, or ∞), depending on the application and garbage collection activity. Read up on generational garbage collection to learn more about this.
The first JVM only supported a conservative garbage collector. There's a bunch of stuff online regarding this design as well, but the main caveat when using one of these type of collectors is that it might never identify an object as being unreachable even when it truly is.
Finally, there's the "epsilon" collector option, which never runs garbage collection at all. This is just used for performance testing and for short-lived Java processes. Once a process exits, the open resources get closed by the operating system anyhow.
Related
Suppose that an object on the heap goes out of scope. Why can't the program free the memory right after the scope ends? Or, if we have a pointer to an object that is replaced by the address to a new object, why can't the program deallocate the old one before assigning the new one? I'm guessing that it's faster not to free it immediately and instead have the freeing be done asynchronously at a later point in time, but I'm not really sure.
Why is garbage collection necessary?
It is not strictly necessary. Given enough time and effort you can always translate a program that depends on garbage collection to one that doesn't.
In general, garbage collection involves a trade-off.
On the one hand, garbage collection allows you to write an application without worrying about the details of memory allocation and deallocation. (And the pain of debugging crashes and memory leaks caused by getting the deallocation logic wrong.)
The downside of garbage collection is that you need more memory. A typical garbage collector is not efficient if it doesn't have plenty of spare space1.
By contrast, if you do manual memory management, you can code your application to free up heap objects as soon as they are no longer used. Furthermore, you don't get awkward "pauses" while the GC is doing its thing.
The downside of manual memory management is that you have to write the code that decides when to call free, and you have to get it correct. Furthermore, if you try to manage memory by reference counting:
you have the cost of incrementing and decrementing ref counts whenever pointers are assign or variables go out of scope,
you have to deal with cycles in your data structures, and
it is worse when your application is multi-threaded and you have to deal with memory caches, synchronization, etc.
For what it is worth, if you use a decent garbage collector and tune it appropriately (e.g. give it enough memory, etc) then the CPU costs of GC and manual storage management are comparable when you apply them to a large application.
Reference:
"The measured cost of conservative garbage collection" by Benjamin Zorn
1 - This is because the main cost of a modern collector is in traversing and dealing with the non-garbage objects. If there is not a lot of garbage because you are being miserly with the heap space, the GC does a lot of work for little return. See https://stackoverflow.com/a/2414621/139985 for an analysis.
It's more complicated, but
1) what if there is memory pressure before the scope is over? Scope is only a language notion, not related to reachability. So an object can be "freed" before it goes out of scope ( java GCs do that on regular basis). Also, if you free objects after each scope is done, you might be doing too little work too often
2) As far as the references go, you are not considering that the reference might have hierarchies and when you change one, there has to be code that traverses those. It might not be the right time to do it when that happens.
In general, there is nothing wrong with such a proposal that you describer, as a matter of fact this is almost exactly how Rust programming language works, from a high level point of view.
According to google, V8 uses an efficient garbage collection by employing a "stop-the-world, generational, accurate, garbage collector". Part of the claim is that the V8 stops program execution when performing a garbage collection cycle.
An obvious question is how can you have an efficient GC when you pause program execution?
I was trying to find more about this topic as I would be interested to know how does the GC impacts the response time when you have possibly tens of thounsands requests per second firing your node.js server.
Any expert help, personal experience or links would be greatly appreciated
Thank you
"Efficient" can mean several things. Here it probably refers to high throughput. When looking at response time, you're more interested in latency, which could indeed be worse than with alternative GC strategies.
The main alternatives to stop-the-world GCs are
incremental GCs, which need not finish a collection cycle before handing back control to the mutator1 temporarily, and
concurrent GCs which (virtually) operate at the same time as the mutator, interrupting it only very briefly (e.g. to scan the stack).
Both need to perform extra work to be correct in the face of concurrent modification of the heap (e.g. if a new object is created and attached to an already-scanned object, this new reference must be noticed). This impacts total throughput, i.e., it takes longer to actually clean the entire heap. The upside is that they do not (usually) interrupt the program for very long, if at all, so latency is low(er).
Although the V8 documentation still mentions a stop-the-world collector, it seems that an the V8 GC is incremental since 2011. So while it does stop program execution once in a while, it does not 2 stop the program for however long it takes to scan the entire heap. Instead it can scan for, say, a couple milliseconds, and let the program resume.
1 "Mutator" is GC terminology for the program whose heap is garbage collected.
2 At least in principle, this is probably configurable.
As far as I know GC is used only when JVM needs more memory, but I'm not sure about it. So, please, someone suggest an answer to this question.
The garbage collection algorithm of Java is as I understand, pretty complex and not as straightforward. Also, there is more than algorithm available for GC, which can be chosen at VM launchtime with an argument passed to the JVM.
There's a FAQ about garbage collection here: http://www.oracle.com/technetwork/java/faq-140837.html
Oracle has also published an article "Tuning Garbage Collection with the 5.0 Java[tm] Virtual Machine" which contains deep insights into garbage collection, and will probably help you understand the matter better: http://www.oracle.com/technetwork/java/gc-tuning-5-138395.html
The JVM and java specs don't say anything about when garbage collection occurs, so its entirely up to the JVM implementors what policies they wish to use. Each JVM should probably have some documention about how it handles GC.
In general though, most JVMs will trigger GC when a memory allocation pushes the total amount of allocated memory above some threshold. There may be mulitple levels of gc (full vs partial/generational) that occur at different thresholds.
Garbage Collection is deliberately vaguely described in the Java Language Specification, to give the JVM implementors best working conditions to provide good garbage collectors.
Hence garbage collectors and their behaviour are extremely vendor dependent.
The simplest but least fun is the one that stops the whole program when needing to clean. Others are more sophisticated and run quietly along your program cleaning up as you go more or less aggressively.
The most fun way to investigate garbage collection is to run jvisualvm in the Sun 6 JDK. It allows you to see many, many internal things many relevant to garbage collection.
https://visualvm.dev.java.net/monitor_tab.html (but the newest version has plenty more)
In the older days garbage collector were empirical in nature. At some set interval or based on certain condition they would kick in and examine each of the object.
Modern days collectors are smarter in the sense that they differentiate based on the fact that objects are different lifespan. The objects are differentiated between young generation objects and tenured generation objects.
Memory is managed in pools according to generation. Whenever the young generation memory pool is filled, a minor collection happens. The surviving objects are moved to tenured generation memory pool. When the tenured generation memory pool gets filled a major collection happens. A third generation is kept which is known as permanent generation and may contain objects defining classes and methods.
It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 10 years ago.
I don't understand garbage collection so good, then I want to know, why it's so important to a language and to the developer?
Many other answers have stated that garbage collection can help to prevent memory leaks but, surprisingly, nobody seems to have mentioned the most important benefit that GC facilitates memory safety. This means that most garbage collected environments completely abstract away the notion of memory locations (i.e. raw pointers) and, consequently, eliminate a major class of bugs.
For example, a common mistake with manual memory management is to accidentally free a value slightly too early and continue to use the memory at that location after it has been freed. This can be extremely difficult to debug because freed memory might not be reallocated and, consequently, seemingly valid operations can be performed on the freed memory that can only fail sporadically with corruption or memory access violations or segmentation faults later in the program's execution, often with no direct link back to the offending code. This class of bugs simply do not exist in most garbage collected environments.
Garbage Collection is a part of many modern languages that attempts to abstract the disposal and reallocation of memory with less direction intervention by the developer.
When you hear talk of "safe" objects, this usually refers to something whose memory can be automatically reallocated by the Garbage Collector after an object falls out of scope, or is explicitly disposed.
While you can write the same program without a garbage collector to help manage memory usage, abstracting this away lets the developer think about more high level things and deliver value to the end user more quickly and efficiently without having to necessarily concentrate as much on lower level portions of the program.
In essence the developer can say
Give me a new object
..and some time later when the object is no longer being used (falls out of scope) the developer does not have to remember to say
throw this object away
Developers are lazy (a good virtue) and sometimes forget things. When working with GC properly, it's okay to forget to take out the trash, the GC won't let it pile up and start to smell.
Garbage Collection is a form of automatic memory management. It is a special case of resource management, in which the limited resource being managed is memory.
Benefits for the programmer is that garbage collection frees the programmer from manually dealing with memory allocation and deallocation.
The bottom line is that garbage collection helps to prevent memory leaks. In .NET, for example, when nothing references an object, the resources used by the object are flagged to be garbage collected. In unmanaged languages, like C and C++, it was up to the developer to take care of cleaning up.
It's important to note, however, that garbage collection isn't perfect. Check out this article on a problem that occurred because the developers weren't aware of a large memory leak.
In many older and less strict languages deallocating memory was hard-coded into programs by the programmer; this of course will cause problems if not done correctly as the second you reference memory that hasn't been deallocated your program will break. To combat this garbage collection was created, to automatically deallocate memory that was no longer being used. The benefits of such a system is easy to see; programs become far more reliable, deallocating memory is effectively removed from the design process, debugging and testing times are far shorter and more.
Of course, you don't get something for nothing. What you lose is performance, and sometimes you'll notice irregular behaviour within your programs, although nowadays with more modern languages this rarely is the case. This is the reason many typical applications are written in Java, it's quick and simple to write without the trauma of chasing memory leaks and it does the job, it's perfect for the world of business and the performance costs are little with the speed of computers today. Obviously some industries need to manage their own memory within their programs (the Games industry) for performance reasons, which is why nearly all major games are written in C++. A lecturer once told me that if every software house was in the same area, with a bar in the middle you'd be able to tell the game developers apart from the rest because they'd be the ones drinking heavily long into the night.
Garbage collection is one of the features required to allow the automatic management of memory allocation. This is what allows you to allocate various objects, maybe introduce other variables referencing or containing these in a fashion or other, and yet never worry about disposing of the object (when it is effectively not in use anymore).
The garbage collection, specifically takes care of "cleaning up" the heap(s) where all these objects are found, by removing unused objects an repacking the others together.
You probably hear a lot about it, because this is a critical function, which happens asynchronously with the program and which, if not handled efficiently can produce some random performance lagging in the program, etc. etc. Nowadays, however the algorithms related to the memory management at-large and the GC (garbage collection) in particular are quite efficient.
Another reason why the GC is sometimes mentioned is in relation to the destructor of some particular object. Since the application has no (or little) control over when particular objects are Garbage-Collected (hence destroyed), it may be an issue if an object waits till its destructor to dispose of some resource and such. That is why many objects implement a Dispose() method, which allow much of that clean-up (of the object itself) to be performed explicitly, rather than be postponed till the destructor is eventually called from the GC logic.
Automatic garbage collection, like java, reuses memory leaks or memory that is no longer being used, making your program efficient. In c++, you have to control the memory yourself, and if you lose access to a memory, then that memory can no longer be used, wasting space.
This is what I know so far from one year of computer science and using/learning java and c++.
Because someone can write code like
consume(produce())
without caring about cleanup. Just like in our current society.
I am trying to profile a specific page of my ASP.NET site to optimize memory usage, but the nature of .NET as a Garbage Collected language is making it tough to get a true picture of what how memory is used and released in the program.
Is there a perfmon counter or other method for profiling that will allow me to see not only how much memory is allocated, but also how much has been released by the program and is just waiting for garbage collection?
Actually nothing in the machine really knows what is waiting for garbage collection: garbage collection is precisely the process of figuring that out and releasing the memory corresponding to dead objects. At best, the GC will have that information only on some very specific instants in its cycle. The detection and release parts are often interleaved (this depends on the GC technology) so it is possible that the GC never has a full count of what could be freed.
For most GC, obtaining such an information is computationally expensive. If you are ready to spend a bit of CPU time on it (it will not be transparent to the application) then you can use GC.Collect() to force the GC to run, immediately followed by a call to GC.GetTotalMemory() to know how much memory has survived the GC. Note that forcing the GC could induce a noticeable pause, and may also decrease overall performance.
This is the "homemade" method; for a more serious analysis, try a dedicated profiler.
The best way that I have been able to profile memory is to use ANTS Profiler from RedGate. You can view a snapshot, what stage of the lifecycle it is in and more. Including actual object values.