Versions we are using:
node-oracledb: 4.1.0 / 5.1.0 / 5.4.0
Platform: Linux, x64
Node.js: v18.1.0
Oracle Client: 19.8.0.0.0
Oracle DB: 19.0.0.0.ru-2021-10.rur-2021-10.r1
Detailed Description
When we use a connection pool (max 30, min 10) and fetch records from a table with more than 250 columns, a few of which are CLOBs, memory keeps increasing, and the increase is in RSS. The memory never comes down, and the app eventually crashes with OOM. We were initially using the LoopBack 3 framework with the Oracle client, but we were able to reproduce the same behavior using the default connection pool code from the sample files (https://github.com/oracle/node-oracledb/blob/main/examples/connectionpool.js). We also found that the issue occurs mainly when using a connection pool.
Include a runnable Node.js script that shows the problem.
https://github.com/oracle/node-oracledb/files/9132973/oracleLeak.zip
This has the insert and create-table scripts. Please run them first and then modify sample.js for the DB connection.
We tested this with min pool 10, max pool 30, and UV_THREADPOOL_SIZE 30.
We also ran valgrind with 5.4.0 and found that the issue shows up as "definitely lost":
==36361== 903,607 (32,256 direct, 871,351 indirect) bytes in 1 blocks are definitely lost in loss record 4,608 of 4,615
==36361== at 0x4C03A83: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==36361== by 0x1181F179: njsConnection_executeAsync (in /home/ubuntu/hkfps/oepy-outbound-payment-all/node_modules/oracledb/build/Release/oracledb-5.4.0-linux-x64.node)
==36361== by 0x11819C24: njsBaton_executeAsync (in /home/ubuntu/hkfps/oepy-outbound-payment-all/node_modules/oracledb/build/Release/oracledb-5.4.0-linux-x64.node)
==36361== by 0x163C973: worker (threadpool.c:122)
==36361== by 0x58B2B42: start_thread (pthread_create.c:442)
==36361== by 0x5943BB3: clone (clone.S:100)
==36361== LEAK SUMMARY:
==36361== definitely lost: 32,256 bytes in 2 blocks
==36361== indirectly lost: 871,351 bytes in 975 blocks
==36361== possibly lost: 2,500,246 bytes in 1,397 blocks
==36361== still reachable: 485,525,664 bytes in 30,537 blocks
==36361== of which reachable via heuristic:
==36361== multipleinheritance: 48 bytes in 1 blocks
==36361== suppressed: 0 bytes in 0 blocks
Valgrind command which was used
valgrind --track-origins=yes --leak-check=full --trace-children=yes --log-file=out_%p.log --demangle=yes --error-limit=no --show-leak-kinds=all node test5.js
I am trying to reproduce a problem.
My C code gives SIGABRT; I traced it back to line 3174 here:
https://elixir.bootlin.com/glibc/glibc-2.27/source/malloc/malloc.c
/* Little security check which won't hurt performance: the allocator
never wrapps around at the end of the address space. Therefore
we can exclude some size values which might appear here by
accident or by "design" from some intruder. We need to bypass
this check for dumped fake mmap chunks from the old main arena
because the new malloc may provide additional alignment. */
if ((__builtin_expect ((uintptr_t) oldp > (uintptr_t) -oldsize, 0)
|| __builtin_expect (misaligned_chunk (oldp), 0))
&& !DUMPED_MAIN_ARENA_CHUNK (oldp))
malloc_printerr ("realloc(): invalid pointer");
My understanding is that memory gets allocated when I call calloc; then, when I call realloc and try to grow the memory area, the heap is not available for some reason, giving SIGABRT.
My other question is: how can I limit the heap area to some number of bytes, say 10 bytes, to replicate the problem? On Stack Overflow, RLIMIT and setrlimit are mentioned, but no sample code is given. Can you provide sample code where the heap size is 10 bytes?
How can I limit the heap area to some bytes say, 10 bytes
Can you provide sample code where heap size is 10 Bytes ?
From How to limit heap size for a c code in linux , you could do:
You could use (inside your program) setrlimit(2), probably with RLIMIT_AS (as cited by Ouah's answer).
#include <sys/resource.h>
#include <stdio.h>

int main(void) {
    /* Limit the total address space (soft and hard) to 10 bytes. */
    if (setrlimit(RLIMIT_AS, &(struct rlimit){10, 10}) != 0)
        perror("setrlimit");
    return 0;
}
Better yet, make your shell do it. With bash it is the ulimit builtin.
$ ulimit -v 10
$ ./your_program.out
to replicate the problem
Most probably, limiting the heap size will just result in a different problem related to the heap size limit. It is most likely unrelated to your bug and will not help you debug it. Instead, I would suggest looking into AddressSanitizer and valgrind.
On inspecting a crash dump file for an out-of-memory exception reported by a client, the results of !DumpHeap -stat showed that 575 MB of memory is taken up by 45,000 objects of type "Free", most of which I assume would have to reside in Gen 2 due to the size.
The first places I looked for problems were the large object heap (LOH) and pinned objects. The large object heap with free space included was only 70MB so that wasn't the issue and running !gchandles showed:
GC Handle Statistics:
Strong Handles: 155
Pinned Handles: 265
Async Pinned Handles: 8
Ref Count Handles: 163
Weak Long Handles: 0
Weak Short Handles: 0
Other Handles: 0
which is a very small number of handles (around 600) compared to the number of free objects (45,000). To me this rules out the free blocks being caused by pinning.
I also looked into the free blocks themselves to see if maybe they had a consistent size, but on inspection the sizes varied widely and went from just short of 5MB to only around 12 bytes or so.
Any help would be appreciated! I am at a loss, since there is fragmentation but no sign of it being caused by the two places I know to look: the large object heap (LOH) and pinned handles.
In which generation are the free objects?
I assume would have to reside in Gen 2 due to the size
Size is not related to generations. To find out in which generation the free blocks reside, you can follow these steps:
From !dumpheap -stat -type Free get the method table:
0:003> !dumpheap -stat -type Free
total 7 objects
Statistics:
MT Count TotalSize Class Name
00723538 7 100 Free
Total 7 objects
From !eeheap -gc, get the start addresses of the generations.
0:003> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x026a1018
generation 1 starts at 0x026a100c
generation 2 starts at 0x026a1000
ephemeral segment allocation context: none
segment begin allocated size
026a0000 026a1000 02731ff4 0x00090ff4(593908)
Large object heap starts at 0x036a1000
segment begin allocated size
036a0000 036a1000 036a3250 0x00002250(8784)
Total Size 0x93244(602692)
------------------------------
GC Heap Size 0x93244(602692)
Then dump the free objects only of a specific generation by passing the start and end address (e.g. of generation 2 here):
0:003> !dumpheap -mt 00723538 0x026a1000 0x026a100c
Address MT Size
026a1000 00723538 12 Free
026a100c 00723538 12 Free
total 2 objects
Statistics:
MT Count TotalSize Class Name
00723538 2 24 Free
Total 2 objects
So in my simple case, there are 2 free objects in generation 2.
Pinned handles
600 pinned objects should not cause 45,000 free memory blocks. Still, from my experience, 600 pinned handles is a lot. But first, check in which generation the free memory blocks reside.
I'm currently learning heap exploitation and overflows, but I'm stuck on an issue.
I'm using glibc v2.19 and trying to examine the heap memory with gdb.
Here is the code that I'm using:
#include <stdlib.h>
#include <string.h>

int main(void) {
    char *p = malloc(20);
    strcpy(p, "AAAAAAAA");
    free(p);
    return 0;
}
This is the gdb output for the allocated chunk:
(gdb) x/4xw 0x0804b008-8
0x804b000: 0x00000000 0x00000019 0x41414141 0x41414141
This is the gdb output for the freed chunk:
(gdb) x/4xw 0x0804b008-8
0x804b000: 0x00000000 0x00000019 0x00000000 0x41414141
I can see that the size field does not change after deallocation. How can this be, and how does the allocator know whether this is a free chunk or not? As I understand it, the low 3 bits of the size field are used as status flags.
Why is the next pointer "0x00000000" not pointing anywhere?
And why is the pointer to the previous chunk "0x41414141" not used?
And how does the allocator look for free chunks in this situation?
I also tried freeing two contiguous chunks, and after I freed the second one, the prev_size field, which should hold the size of the previous free chunk, is also not used; it is just zeroed.
The manual page told me a lot, and through it I learned much of the background of glibc's memory management.
But I am still confused. Does malloc_trim(0) (note zero as the parameter) mean that (1) all the memory in the heap section will be returned to the OS, or (2) just all the unused memory at the topmost region of the heap will be returned to the OS?
If the answer is (1), what about memory in the heap that is still in use? If the heap has used memory somewhere, will it be eliminated, or will the function fail?
If the answer is (2), what about the "holes" at places other than the top of the heap? They are unused memory, but the topmost region of the heap is still in use; will this call still work efficiently?
Thanks.
The man page of malloc_trim was committed here: https://github.com/mkerrisk/man-pages/blob/master/man3/malloc_trim.3 and, as I understand, it was written from scratch in 2012 by the man-pages project maintainer, Michael Kerrisk: https://github.com/mkerrisk/man-pages/commit/a15b0e60b297e29c825b7417582a33e6ca26bf65
As far as I can tell from grepping glibc's git, there are no man pages in glibc, and no commit to the malloc_trim man page documents this patch. The best, and the only, documentation of glibc malloc is its source code: https://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/malloc.c
There are malloc_trim comments from malloc/malloc.c:
Additional functions:
malloc_trim(size_t pad);
/*
  malloc_trim(size_t pad);

  If possible, gives memory back to the system (via negative
  arguments to sbrk) if there is unused memory at the `high' end of
  the malloc pool. You can call this after freeing large blocks of
  memory to potentially reduce the system-level memory requirements
  of a program. However, it cannot guarantee to reduce memory. Under
  some allocation patterns, some large free blocks of memory will be
  locked between two used chunks, so they cannot be given back to
  the system.

  The `pad' argument to malloc_trim represents the amount of free
  trailing space to leave untrimmed. If this argument is zero,
  only the minimum amount of memory to maintain internal data
  structures will be left (one page or less). Non-zero arguments
  can be supplied to maintain enough trailing space to service
  future expected allocations without having to re-obtain memory
  from the system.

  Malloc_trim returns 1 if it actually released any memory, else 0.
  On systems that do not support "negative sbrks", it will always
  return 0.
*/
int __malloc_trim(size_t);
Freeing from the middle of a chunk is not documented as text in malloc/malloc.c and not documented in the man-pages project. The man page from 2012 may be the first man page of the function, written not by the authors of glibc. The info page of glibc only mentions the M_TRIM_THRESHOLD of 128 KB:
https://www.gnu.org/software/libc/manual/html_node/Malloc-Tunable-Parameters.html#Malloc-Tunable-Parameters and doesn't list the malloc_trim function https://www.gnu.org/software/libc/manual/html_node/Summary-of-Malloc.html#Summary-of-Malloc (it also doesn't document memusage/memusagestat/libmemusage.so).
In December 2007 there was a commit https://sourceware.org/git/?p=glibc.git;a=commit;f=malloc/malloc.c;h=68631c8eb92ff38d9da1ae34f6aa048539b199cc by Ulrich Drepper (it is part of glibc 2.9 and newer) which changed the mtrim implementation (but it didn't change any documentation or man page, as there are no man pages in glibc):
malloc/malloc.c (public_mTRIm): Iterate over all arenas and call
mTRIm for all of them.
(mTRIm): Additionally iterate over all free blocks and use madvise
to free memory for all those blocks which contain at least one
memory page.
Unused parts of chunks (anywhere, including chunks in the middle), aligned on page size and larger than a page, may be marked as MADV_DONTNEED: https://sourceware.org/git/?p=glibc.git;a=blobdiff;f=malloc/malloc.c;h=c54c203cbf1f024e72493546221305b4fd5729b7;hp=1e716089a2b976d120c304ad75dd95c63737ad75;hb=68631c8eb92ff38d9da1ae34f6aa048539b199cc;hpb=52386be756e113f20502f181d780aecc38cbb66a
INTERNAL_SIZE_T size = chunksize (p);
if (size > psm1 + sizeof (struct malloc_chunk))
{
/* See whether the chunk contains at least one unused page. */
char *paligned_mem = (char *) (((uintptr_t) p
+ sizeof (struct malloc_chunk)
+ psm1) & ~psm1);
assert ((char *) chunk2mem (p) + 4 * SIZE_SZ <= paligned_mem);
assert ((char *) p + size > paligned_mem);
/* This is the size we could potentially free. */
size -= paligned_mem - (char *) p;
if (size > psm1)
madvise (paligned_mem, size & ~psm1, MADV_DONTNEED);
}
This is one of only two usages of madvise with MADV_DONTNEED in glibc now: one for the top part of heaps (shrink_heap) and the other for marking any chunk (mtrim): http://code.metager.de/source/search?q=MADV_DONTNEED&path=%2Fgnu%2Fglibc%2Fmalloc%2F&project=gnu
H A D arena.c 643 __madvise ((char *) h + new_size, diff, MADV_DONTNEED);
H A D malloc.c 4535 __madvise (paligned_mem, size & ~psm1, MADV_DONTNEED);
We can test the malloc_trim with this simple C program (test_malloc_trim.c) and strace/ltrace:
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <malloc.h>
int main()
{
int *m1,*m2,*m3,*m4;
printf("%s\n","Test started");
m1=(int*)malloc(20000);
m2=(int*)malloc(40000);
m3=(int*)malloc(80000);
m4=(int*)malloc(10000);
// check that all arrays are allocated on the heap and not with mmap
printf("1:%p 2:%p 3:%p 4:%p\n", m1, m2, m3, m4);
// free 40000 bytes in the middle
free(m2);
// call trim (same result with 2000 or 2000000 argument)
malloc_trim(0);
// call some syscall to find this point in the strace output
sleep(1);
free(m1);
free(m3);
free(m4);
// malloc_stats(); malloc_info(0, stdout);
return 0;
}
Compile and run under strace:
gcc test_malloc_trim.c -o test_malloc_trim
strace ./test_malloc_trim
write(1, "Test started\n", 13Test started
) = 13
brk(0) = 0xcca000
brk(0xcef000) = 0xcef000
write(1, "1:0xcca010 2:0xccee40 3:0xcd8a90"..., 441:0xcca010 2:0xccee40 3:0xcd8a90 4:0xcec320
) = 44
madvise(0xccf000, 36864, MADV_DONTNEED) = 0
...
nanosleep({1, 0}, 0x7ffffafbfff0) = 0
brk(0xceb000) = 0xceb000
So, there was a madvise with MADV_DONTNEED for 9 pages after the malloc_trim(0) call, when there was a hole of 40008 bytes in the middle of the heap.
The man page for malloc_trim says it releases free memory, so if there is allocated memory in the heap, it won't release the whole heap. The parameter is there if you know you're still going to need a certain amount of memory, so freeing more than that would cause glibc to have to do unnecessary work later.
As for holes, this is a standard problem with memory management and returning memory to the OS. The primary low-level heap management available to the program is brk and sbrk, and all they can do is extend or shrink the heap area by changing the top. So there's no way for them to return holes to the operating system; once the program has called sbrk to allocate more heap, that space can only be returned if the top of that space is free and can be handed back.
Note that there are other, more complex ways to allocate memory (with anonymous mmap, for example), which may have different constraints than sbrk-based allocation.