Private variables memory allocation - node.js

Given a typical structure like the one below, when are the different variables freed by the garbage collector?
'use strict';
var $ = require('jquery');
var somePrivateVar = new Whatever();
module.exports = function () {
    var someInsideVar = new Whatother();
    var someOtherInsideVar = $('.myStuf');
    $(window).scroll(function () {
        somePrivateVar.MoreStuff();
        doSomeStuff(someInsideVar);
        someOtherInsideVar.toggle();
    });
};
EDITED: The question proposed as a duplicate is related, but it is not the point of this one. I already know a little about garbage collection, and I'm not interested in handling garbage or avoiding it. I'm interested in how Node.js sets up modules behind the scenes, closure-wise. Put another way: how does Node.js apply the principles described in the first answer to that other question in order to manage memory efficiently?

A module in node.js is simply a closure that stays alive by the same rules as any other closure in JavaScript. As long as any code within the closure is still reachable by other code, the closure itself can't be garbage collected. Modules are also cached by the module loader, which means a reference to the module is kept alive in the module cache even if no other code has retained a reference to it. You may find it helpful to read this article: How require() Actually Works. It is the normal JavaScript GC of a scope that determines when a given module can be garbage collected. There is no special garbage collection for modules.
So, your module variables will be alive as long as this module closure created when the module was loaded stays referenced. While that closure is alive, the only way that contents of the referenced variables within the module would be freed while the module was loaded is if you clear the contents of those variables manually (e.g. setting to null).
By default, a node module remains alive and loaded in the node module cache (waiting for some other code to require() it in again) even if it is no longer currently being used or referenced by the rest of your code. It can be manually removed from the cache if you so choose. See this answer for details on manually removing a module (and perhaps any modules that it also loads) from the cache: Unloading node code/modules. So, if none of your code has a reference to the module, the module has no live event handlers or callbacks in it and you've manually removed the module from the cache, then the module closure should no longer have any reachable code references into it and the GC can free that whole scope (and thus the module).
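As a rough illustration of removing a module from the cache, the sketch below writes a throwaway module to a temporary directory, loads it twice, then deletes its entry from require.cache and loads it again. The file name counter.js and the variable names are made up for the example. It shows only that the cache entry is dropped and the module body runs again; whether the old closure is actually freed still depends on no other references remaining.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Create a throwaway module on disk (hypothetical file, for illustration only).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'modcache-'));
const modPath = path.join(dir, 'counter.js');
fs.writeFileSync(modPath, 'let n = 0; module.exports = { next: () => ++n };');

const firstLoad = require(modPath);
firstLoad.next(); // private counter inside the module is now 1
const secondLoad = require(modPath); // served from the cache, same object

// Drop the cache entry; the key is the resolved filename.
delete require.cache[require.resolve(modPath)];

const reloaded = require(modPath); // module body runs again, new closure
const freshCount = reloaded.next(); // counts from a fresh private counter

fs.rmSync(dir, { recursive: true, force: true }); // clean up the temp directory
```

Note that firstLoad still points at the old module instance here, so the old closure remains reachable until that variable is dropped as well.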
A module remains alive as long as any code in it is still reachable (e.g. can still be called by any other code). In your case, this would be the case if any other code still has a reference to the module or as long as the event handler in the module is still alive (could still be invoked) because the event handler callback references code within the module. It is not always clear exactly how smart a given garbage collector will be about knowing when a given event handler is done and can never be invoked again in the future.
In your specific example, the $(window).scroll() event handler (which seems like a bit of a made up example because there's usually no window object in node.js that does scrolling) would theoretically be alive forever until you manually removed the event handler with .off() or until the window object itself is deleted or something like that. So, the references inside that event handler would never go away on their own.
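Since there is no scrolling window in node.js, the following sketch uses Node's EventEmitter as a stand-in for the long-lived event source. The names attach, bigState and onScroll are hypothetical. It shows the point made above: while the handler is registered, the emitter references it (and everything it closes over), and only explicit removal breaks that link.

```javascript
const { EventEmitter } = require('events');

// Stand-in for a long-lived event source such as `window` (an assumption for this sketch).
const source = new EventEmitter();

function attach() {
  const bigState = { data: new Array(1000).fill('x') }; // captured by the handler below
  const onScroll = () => bigState.data.length;
  // The emitter now references onScroll, which in turn references bigState.
  source.on('scroll', onScroll);
  // Return a function that detaches the handler, analogous to jQuery's .off().
  return () => source.off('scroll', onScroll);
}

const detach = attach();
const whileAttached = source.listenerCount('scroll');
detach();
const afterDetach = source.listenerCount('scroll');
```

After detach() the emitter no longer holds the handler; the captured state becomes collectable once nothing else (including the detach function itself) references it.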
Other event handlers that have a specific lifetime such as an Ajax success handler will be done when the ajax call itself has finished executing and all callbacks have been called. Those event handlers will release any references they hold when they are done. Same for a setTimeout(). It will release any reference it holds when it executes (or when the timer is cancelled).
It is often hard to predict how smart a garbage collector can be and when it will realize that a given variable is no longer reachable and thus can be garbage collected. Some things are easy to understand, such as when a variable goes out of scope and no other references to the scope remain, so the entire scope will be GCed. But some things are not so simple, such as when a scope is still alive because an event handler in that scope is still alive, but nothing in that particular event handler could actually reference a given variable within that scope. Whether or not the GC will actually try to GC a single variable within that scope in that case is implementation dependent. Things like eval() and constructing new Function objects with code built via string manipulation make it very hard for JavaScript to know exactly what could and could not be referenced in the future from a given event handler or callback (since it's possible to construct almost any reference programmatically without the interpreter knowing what you "may" reference in the future). This complicates fine-grained garbage collection. What you can count on is whole-scope garbage collection (when the whole scope is released). Counting on finer-grained GC than that is probably not wise. If you are explicitly done with a very large variable within a scope that might last a lot longer, then it's safer to just null out that very large variable so its specific reference to the large data is cleared when you want it to be cleared. This wouldn't be significant for a small string (unless you had tens of thousands of these objects), but might be relevant for a large buffer or very large string.
Edit: It does appear that V8 does garbage collection of individual variables within a scope if those variables themselves are not referenced within any of the code that is still reachable within the closure and there are no uses of eval() in that same code. I have not found any authoritative references on the subject, but have verified that this appears to be the case in testing actual situations.
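The advice about nulling out a large variable can be sketched as follows. createSession, bigBuffer and the 10 MB size are hypothetical choices for the illustration. The idea is that the returned closures may live for a long time, so the code clears its own reference to the large buffer explicitly instead of relying on the engine to work out that it is unused.

```javascript
function createSession() {
  // Large data that is only needed for a while (size chosen arbitrarily for this sketch).
  let bigBuffer = Buffer.alloc(10 * 1024 * 1024);
  const byteLength = bigBuffer.length; // small derived value that is needed long-term

  return {
    size: () => byteLength,
    hasBuffer: () => bigBuffer !== null,
    // Explicitly drop the reference so the buffer can be collected
    // even though this scope stays alive through the other closures.
    release: () => { bigBuffer = null; },
  };
}

const session = createSession();
const before = session.hasBuffer();
session.release();
const after = session.hasBuffer();
```

This does not force a collection; it only guarantees that this scope no longer keeps the buffer reachable.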

Related

Does calling `gc()` manually, result in all `finalizers` being executed immediately?

I have some code that I suspect is leaking memory.
The code uses ccall and keeps significant information behind pointers,
which are supposed to be freed by code that is ccalled from finalizers.
In my debugging I am calling gc(),
and I want to know whether this will immediately trigger all finalizers attached to objects that have gone out of scope.
Answers should be concerned only with Julia 0.5+.
After the discussion on @Isaiah's answer (deleted), I decided to poke some internals folks and get some clarity on this. As a result, I have it on good authority that when gc() is called at the top level – i.e. not in a local scope – then the following assurance can be relied upon:
if an object is unreachable and you call gc() it’ll be finalized
which is pretty clear cut. The top-level part is significant since when you call gc() in a local scope, local references may or may not be considered reachable, even if they will never be used again.
This assurance does sweep some uncertainty under the carpet of "reachability" since it may not be obvious whether an object is reachable or not because the language runtime may keep references to some objects for various reasons. These reasons should be exhaustively documented, but currently they are not. A couple of notable cases where the runtime holds onto objects are:
The unique instance of a singleton type is permanent and will never be collected or finalized;
Method caches are also permanent, which in particular, means that modules are not freed when you might otherwise expect them to be since method caches keep references to the modules in which they are defined.
Under "normal circumstances" however – which is what I suspect this question is getting at – yes, calling gc() when an object is no longer reachable will cause it to be collected and finalized "immediately", i.e. before the gc() call returns.

Node.js GC with unassigned function constructs

I know it's generally considered best practice in javascript to always assign a new function to a variable, even if it's not used. But in relation to garbage collection in node.js, is v8 able to GC functions that are not assigned to variables or does it make no difference?
As long as all references to a variable (anonymous or assigned) have been destroyed (in this case, by deallocation of the containing function) v8 should garbage collect it.

How can I delete and deallocate OVM objects in SystemVerilog?

I would like to delete an ovm object (and its children) so that I can recreate it with different configs. Is there a way to do this in OVM?
Currently, when I try to create the object a second time with new, I get the following VCS runtime error:
[CLDEXT] Cannot set 'ap' as a child of 'instance', which already has a child by that name.
I realize that I can simply use a different name to "re-create" the instance, but then I'll still have the old instance sitting around and soaking up memory.
OVM is just a SystemVerilog library. That means that all the rules of SystemVerilog apply to OVM. So, yes, you can use new() with OVM. Sometimes it's preferable to use the factory, and sometimes it's preferable to use new() (that's a topic for a different discussion).
SystemVerilog does not have a delete operator or a destructor like C++. Instead, when you are done with an object you just remove all references to it and the garbage collector will clean up the memory. Here's a quote from the SystemVerilog reference manual (IEEE 1800-2009) section 8.7:
SystemVerilog does not require the complex memory allocation and deallocation of C++. Construction of an object is straightforward; and garbage collection, as in Java, is implicit and automatic. There can be no memory leaks or other subtle behaviors, which are so often the bane of C++ programmers.
It's not entirely true that you cannot have a memory leak. You can forget to remove all references to an object and the garbage collector will not know to pick it up. However, you do not have to worry about memory with the same detail as you do in C++.
The particular error you received with id CLDEXT is from ovm_component class. From the message it appears that you attempted to create two components with the same name and the same parent. Components in OVM are typically static. That is, you create and elaborate them once, usually at time 0, and don't delete or add components after that. Because of this model there are no methods in ovm_component to remove child components. So there really isn't a good way to replace a component once it has been instantiated. By the way, this only applies to components. Other types of objects can be re-allocated.
If you feel that you need to replace a component with a different one after time 0, you should rethink the architecture of your testbench. There are probably better ways to accomplish what you are trying to do without replacing components.
I have only UVM experience but I think OVM is similar. I would have liked to reply to @Victor Lyuboslavsky's comment but I can't add comments.
The issue is with the name 'ap' which evidently has already been used for a child of 'instance'. Use this code instead.
static int instNum = 0;
instance_ap = my_ovm_extended_class::type_id::create
($sformatf ("ap%0d", instNum), this);
The first time an object is created & the handle assigned to 'instance_ap', the object would have the name 'instance.ap0'. The next time the code executes an object called 'instance.ap1', and so on.
As mentioned by other posters this ought to be done only for non-component objects, and components should be static and must be created during/before the build phase & connected to each other during/before the connect phase.
Try assigning null to the object before calling new again.
Unless I see someone else answer this question, I'd say there is no easy way to deallocate objects in OVM framework.
OVM testbenches are static and created when the testbench is created.
When the environment class is instantiated, it will call new(create), build, connect, end_of_elaboration, start_of_simulation, run and check on all components.
By the end of the environment build phase all components must be created.
By the end of the environment connect phase all components must have their TLM ports connected.
Because of these requirements, you can not change components (or port connections) except for during the phase.
As part of the static nature of the testbench environment, every component must have a unique get_full_name() response. This is because string lookups are used to identify components in the hierarchy.
Assigning an object to null should deallocate memory. If there is no other handle pointing to that memory location, then it should get reclaimed.

Can I use [self retain] to hold the object itself in objective-c?

I'm using [self retain] to hold an object itself, and [self release] to free it elsewhere. This is very convenient sometimes. But this is actually a reference loop, or deadlock, which most garbage-collection systems aim to solve. I wonder if Objective-C's autorelease pool may find such loops and surprise me by releasing the object before [self release] is reached. Is my approach encouraged or not? How can I ensure that garbage collection, if present, won't be too smart?
This way of working is very discouraged. It looks like you need some pointers on memory management.
Theoretically, an object should live as long as it is useful. Useful objects can easily be spotted: they are directly referenced somewhere on a thread stack, or, if you made a graph of all your objects, reachable through some path linked to an object referenced somewhere on a thread stack. Objects that live "by themselves", without being referenced, cannot be useful, since no thread can reach to them to make them perform something.
This is how a garbage collector works: it traverses your object graph and collects every unreferenced object. Mind you, Objective-C is not always garbage-collected, so some rules had to be established. These are the memory management guidelines for Cocoa.
In short, it is based on the concept of 'ownership'. When you look at the reference count of an object, you immediately know how many other objects depend on it. If an object has a reference count of 3, it means that three other objects need it to work properly (and thus own it). Every time you keep a reference to an object (except in rare conditions), you should call its retain method. And before you drop the reference, you should call its release method.
There are some other important rules regarding the creation of objects. When you call alloc, copy or mutableCopy, the object you get already has a refcount of 1. In this case, it means the calling code is responsible for releasing the object once it's not required. This can be problematic when you return references to objects: once you return it, in theory, you don't need it anymore, but if you call release on it, it'll be destroyed right away! This is where NSAutoreleasePool objects come in. By calling autorelease on an object, you give up ownership of it (as if you called release), except that the reference is not immediately revoked: instead, it is transferred to the NSAutoreleasePool, which will release it once it receives the release message itself. (Whenever some of your code is called back by the Cocoa framework, you can be assured that an autorelease pool already exists.)
It also means that you do not own objects if you did not call alloc, copy or mutableCopy on them; in other words, if you obtain a reference to such an object otherwise, you don't need to call release on it. If you need to keep around such an object, as usual, call retain on it, and then release when you're done.
Now, if we try to apply this logic to your use case, it stands out as odd. An object cannot logically own itself, as it would mean that it can exist, standalone in memory, without being referenced by a thread. Obviously, if you have the occasion to call release on yourself, it means that one of your methods is being executed; therefore, there's gotta be a reference around for you, so you shouldn't need to retain yourself in the first place. I can't really say with the few details you've given, but you probably need to look into NSAutoreleasePool objects.
If you're using the retain/release memory model, it shouldn't be a problem. Nothing will go looking for your [self retain] and subvert it. That may not be the case, however, if you ever switch over to using garbage collection, where -retain and -release are no-ops.
Here's another thread on SO on the same topic.
I'd reiterate the answer that includes the phrase "overwhelming sense of ickyness." It's not illegal, but it feels like a poor plan unless there's a pretty strong reason. If nothing else, it seems sneaky, and that's never good in code. Do heed the warning in that thread to use -autorelease instead of -release.

When the observer pattern causes GC problems

In a GC-enabled language, when an observer subscribes to a subject's events, the subject actually holds a reference to the observer.
So before dropping an observer, you must unsubscribe it first. Otherwise, because it is still referenced by the subject, it will never be garbage collected.
Normally there are two solutions:
Manually unsubscribe.
Use a weak reference.
Both of them cause other problems.
So I usually don't like to use the observer pattern, but I still cannot find any replacement for it.
I mean, this pattern describes things in such a natural way that you could hardly find anything better.
What do you think about it?
In this scenario, you can use finalize() in Java. finalize() is a bad idea when you have to release a resource (like a DB connection) because some outside system is affected. In your case, the object which installed the observer will be GC'd during the runtime of your app and then, finalize() will be called and it can unsubscribe the observer.
Not exactly what you want but someone must decide "it's okay to unsubscribe, now". That either happens when your subject goes away (but it should already kill all observers) or the object which installed the observer.
If your app terminates unexpectedly, well, it doesn't hurt that finalize() might not be called in this case.
If you want to remove an observer you should inform the publisher by unsubscribing, first, otherwise it will try to send out events and depending on how it is written, it could crash the app, quietly ignore the error or remove the observer. But, if you open something, close it; if you subscribe, unsubscribe.
The fact that you are not unsubscribing is a bad design, IMO. Don't blame the pattern for a poor implementation.
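The "if you subscribe, unsubscribe" discipline can be sketched like this. The Subject class and its method names are hypothetical, not taken from any particular library. Here subscribe() hands back an unsubscribe function, which makes it harder to forget how to undo the registration and removes the subject's reference to the observer when called.

```javascript
// Hypothetical minimal subject; names are invented for this illustration.
class Subject {
  constructor() {
    this.observers = new Set(); // strong references to every subscribed observer
  }

  subscribe(observer) {
    this.observers.add(observer);
    // Returning the matching "undo" keeps subscribe and unsubscribe paired.
    return () => this.observers.delete(observer);
  }

  notify(value) {
    for (const observer of this.observers) {
      observer(value);
    }
  }
}

const subject = new Subject();
const received = [];
const unsubscribe = subject.subscribe((value) => received.push(value));

subject.notify('a'); // delivered
unsubscribe();       // the subject drops its reference to the observer
subject.notify('b'); // not delivered
const remaining = subject.observers.size;
```

Once unsubscribed, the subject no longer keeps the observer reachable, so it can be collected when the rest of the program lets go of it.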
The observer pattern works well, but if you want to alleviate some of the issues, you could use AOP for the implementation:
http://www.cin.ufpe.br/~sugarloafplop/final_articles/20_ObserverAspects.pdf
Consider the scenario of an object which counts how often some observable thing changes. There are two types of references to the object: (1) those by entities which are interested in the count; (2) those used by the observable thing(s) which aren't really interested in the count, but need to update it. The entities which are interested in the count should hold a reference to an object which in turn holds a reference to the one that manages the count. The entities that will have to update the count but aren't really interested in it should just hold references to the second object.
If the first object holds a finalizer, it will be fired when the object goes out of scope. That could trigger the second object to unsubscribe, but it should probably not be unsubscribed directly. Unsubscription would probably require acquiring a lock, and finalizers should not wait on locks. Instead, the finalizer of the first object should probably add that object to a linked list maintained using Interlocked.CompareExchange, and some other thread should periodically poll that list for objects needing unsubscription.
Note, btw: If the first object holds a reference to second object, the latter would be guaranteed to exist when the finalizer for the first object runs, but it would not be guaranteed to be in any particular state. The cleanup thread should not try to do anything with it other than unsubscribe.

Resources