I pass the options as
{
...
chunk: () => this.chunk(results,parser),
complete: () => this.complete(results,parser)
}
This doesn't leak memory when worker is false, but it does leak when worker is true. The leak happens because the closures are somehow retained. Looking at the papaparse code, the callback functions are copied onto the worker object as worker.userXYZ functions. However, at the end the worker is terminated and deleted, so ideally the closures should have been freed too.
Is there any extra step required when using the worker option so the memory is freed up properly?
This seems to be related to "Worker objects are not freed".
I had a large object held alive via the chunk function, which refers to this. Setting that reference to null in complete freed up the memory.
The following is written at http://psy-lob-saw.blogspot.com/2015/12/safepoints.html:
A Java thread is at a safepoint while executing JNI code. Before
crossing the native call boundary the stack is left in a consistent
state before handing off to the native code. This means that the
thread can still run while at a safepoint.
How is this possible? After all, I can pass an object's reference to JNI, and in JNI I can set a field in that object.
It is clear that the object can't be collected (we hold a local reference). But it can be moved to the old generation by the GC during a full collection.
So we have the following situation:
GC thread:                                 | Thread executing JNI code:
compacts the old generation and moves      | modifies the fields of an object
an object from the young generation        | that may be moved at this very
to the old generation.                     | moment. A catastrophe!
How does the JVM deal with that?
Almost every JNI call has a safepoint guard. Whenever you invoke a JNI function from a native method, the thread switches from in_native to in_vm state. Part of this transition is a safepoint check.
See ThreadStateTransition::transition_from_native(), which calls JavaThread::check_safepoint_and_suspend_for_native_trans(thread):
// Slow path when the native==>VM/Java barriers detect a safepoint is in
// progress or when _suspend_flags is non-zero.
// Current thread needs to self-suspend if there is a suspend request and/or
// block if a safepoint is in progress.
That is, a thread calling a JNI function while GC is active will be suspended until the GC completes.
In my UMDF driver I have an IWDFMemory wrapped inside a CComPtr:
CComPtr<IWDFMemory> memory;
The documentation of CComPtr says that if a CComPtr object goes out of scope, its interface pointer is automatically released. That means this code should not create any memory leaks:
void main()
{
CComPtr<IWDFDriver> driver = /*driver*/;
/*
driver initialisation
*/
{
// new scope starts here
CComPtr<IWDFMemory> memory = NULL;
driver->CreateWdfMemory(0x1000000, NULL, NULL, &memory);
// At this point 16 MB of memory has been allocated.
// I can verify this in the task manager.
// scope ends here
}
// If I understand right the memory I allocated in previous scope should already
// be freed at this point. But in the task manager I still can see the 16 MB
// memory used by the process.
}
Also, if I manually assign NULL to memory or call memory.Release() before the scope ends, the memory does not get freed. I am wondering what is happening here.
According to MSDN:
If NULL is specified in the pParentObject parameter, the driver object
becomes the default parent object for the newly created memory object.
If a UMDF driver creates a memory object that the driver uses with a
specific device object, request object, or other framework object, the
driver should set the memory object's parent object appropriately.
When the parent object is deleted, the memory object and its buffer
are deleted.
Since you do indeed pass NULL, the memory won't be released until the CComPtr<IWDFDriver> object is released.
Heap allocations are a bottleneck in my application and I would like to avoid them when sending small tasks to my thread pool.
Can I use a std::packaged_task with a stack allocator? Under which conditions? What are the pros and cons of this choice? Are there better alternatives for avoiding heap allocation of std::future's shared state by operator new?
auto foo() {
arena<1024> buffer;
auto task = std::packaged_task<int()>{
std::allocator_arg,          // the tag value, not the type std::allocator_arg_t
arena_allocator{buffer},     // allocate the shared state in the arena
[]() -> int { return 5; }
};
auto f = task.get_future(); // is this future and its shared state stack allocated?
thread_pool.push_back(std::move(task));
// I will probably need to block before the stack goes out of scope..
return f.get();
}
Your "I will probably need to block before the stack goes out of scope" comment correctly identifies the only issue here. Because the task lives on the sending thread's stack, you must make sure it stays there until your thread pool has executed it.
Other than that, there are no issues with using the stack, instead of heap allocation.
I have a memory leak, and I think I know where it is, but I don't know why it is happening.
The memory leak occurs while load-testing the following endpoint (using a restify.js server):
server.get('/test',function(req,res,next){
fetchSomeDataFromDB().done(function(err,rows){
res.json({ items: rows })
next()
})
})
I am pretty sure that the res object is not disposed of by the garbage collector. On every request the memory used by the app grows. I have done an additional test:
var data = {}
for(var i = 0; i < 500; ++i) {
data['key'+i] = 'abcdefghijklmnoprstuwxyz1234567890_'+i
}
server.get('/test',function(req,res,next){
fetchSomeDataFromDB().done(function(err,rows){
res._someVar = _.extend({},data)
res.json({ items: rows })
next()
})
})
So on each request I am assigning a big object to the res object as an attribute. I observed that with this additional attribute, memory grows much faster: about 100 MB per 1000 requests performed over 60 seconds. After the next identical test, memory grows another 100 MB, and so on. Now that I know the res object is not being released, how can I track down what is still keeping a reference to it? Say I take a heap snapshot: how can I find what is referencing res?
Screenshot of a heap comparison after 10 requests:
Actually, it seems that Instance.DAO is leaking? This class belongs to the ORM I am using to query the DB. What do you think?
One more screenshot of the same comparison, sorted by #delta:
It seems more likely that the GC simply hasn't collected the object yet, since you are not leaking res anywhere in this code. Try running your script with the --expose-gc node argument and then set up an interval that periodically calls gc();. This forces the GC to run instead of being lazy.
If after that you find that you are leaking memory for sure, you could use a tool like the heapdump module together with the Chrome developer tools' heap inspector to see what objects are taking up space.
As the title says, how do two or more threads share memory on the heap that they have allocated? I've been thinking about it and I can't figure out how they can do it. Here is my understanding of the process, presumably I am wrong somewhere.
Any thread can add or remove a given number of bytes on the heap by making a system call which returns a pointer to this data, presumably by writing to a register which the thread can then copy to the stack.
So two threads A and B can allocate as much memory as they want. But I don't see how thread A could know where the memory that thread B has allocated is located. Nor do I know how either thread could know where the other thread's stack is located. Multi-threaded programs share the heap and, I believe, can access one another's stacks, but I can't figure out how.
I tried searching for this question but only found language specific versions that abstract away the details.
Edit:
I am trying not to be language or OS specific but I am using Linux and am looking at it from a low level perspective, assembly I guess.
My interpretation of your question: How can thread A get to know a pointer to the memory B is using? How can they exchange data?
Answer: They usually start with a common pointer to a common memory area. That allows them to exchange other data including pointers to other data with each other.
Example:
Main thread allocates some shared memory and stores its location in p
Main thread starts two worker threads, passing the pointer p to them
The workers can now use p and work on the data pointed to by p
And in a real language (C#) it looks like this:
//start function ThreadProc and pass someData to it
new Thread(ThreadProc).Start(someData);
Threads usually do not access each others stack. Everything starts from one pointer passed to the thread procedure.
Creating a thread is an OS function. It works like this:
The application calls the OS using the standard ABI/API
The OS allocates stack memory and internal data structures
The OS "forges" the first stack frame: It sets the instruction pointer to ThreadProc and "pushes" someData onto the stack. I say "forge" because this first stack frame does not arise naturally but is created by the OS artificially.
The OS schedules the thread. ThreadProc does not know it has been set up on a fresh stack. All it knows is that someData is at the usual stack position where it would expect it.
And that is how someData arrives in ThreadProc. This is how the first, initial data item is shared. Steps 1-3 are executed synchronously by the parent thread; step 4 happens on the child thread.
A really short answer from a bird's view (1000 miles above):
Threads are execution paths of the same process, and the heap actually belongs to the process (and as a result is shared by the threads). Each thread just needs its own stack to function as a separate unit of work.
Threads can share memory on a heap if they both use the same heap. By default most languages/frameworks have a single default heap that code can allocate from. In unmanaged languages you generally make explicit calls to allocate heap memory; in C, for example, that is malloc. In managed languages heap allocation is usually automatic, and how allocation is done depends on the language, usually through the new operator. But that depends slightly on context: if you provide the OS or language you're asking about, I might be able to provide more detail.
A thread shares with the other threads belonging to the same process: its code section, data section, and other operating-system resources such as open files and signals.
The part you are missing is static memory, which contains static variables.
This memory is allocated when the program is started and assigned known addresses (determined at link time). All threads can access this memory without exchanging any data at runtime, because the addresses are effectively hardcoded.
A simple example might look like this:
// Global variable.
std::atomic<int> common_var;
void thread1() {
common_var = compute_some_value();
}
void thread2() {
do_something();
int current_value = common_var;
do_more();
}
And of course the global value may be a pointer, that can be used to exchange heap memory. The producer allocates some objects, the consumer takes and uses them.
// Global variable.
std::atomic<bool> produced;
SomeData* data_pointer;
void producer_thread() {
while (true) {
if (!produced) {
SomeData* new_data = new SomeData();
data_pointer = new_data;
// Let the other thread know there is something to read.
produced = true;
}
}
}
void consumer_thread() {
while (true) {
if (produced) {
SomeData* my_data = data_pointer;
data_pointer = nullptr;
// Let the other thread know we took the data.
produced = false;
do_something_with(my_data);
delete my_data;
}
}
}
Please note: these are not examples of good concurrent code, but they show the general idea without too much clutter.