JVM is using more memory than Native Memory Tracking says, how do I locate where the extra memory goes? [duplicate] - memory-leaks

This question already has answers here:
Java process memory usage (jcmd vs pmap)
(3 answers)
Where do these java native memory allocated from?
(1 answer)
Closed 5 years ago.
I'm running Jetty on my web server with the JVM settings -Xmx4g -Xms2g. However, Jetty uses a lot more memory than that, and I don't know where the extra memory goes.
Jetty uses 4.547 GB of memory in total.
The heap usage report shows heap memory at about 2.5 GB:
Heap Usage:
New Generation (Eden + 1 Survivor Space):
capacity = 483196928 (460.8125MB)
used = 277626712 (264.76546478271484MB)
free = 205570216 (196.04703521728516MB)
57.45622455612963% used
Eden Space:
capacity = 429522944 (409.625MB)
used = 251267840 (239.627685546875MB)
free = 178255104 (169.997314453125MB)
58.4992824038755% used
From Space:
capacity = 53673984 (51.1875MB)
used = 26358872 (25.137779235839844MB)
free = 27315112 (26.049720764160156MB)
49.109214624351345% used
To Space:
capacity = 53673984 (51.1875MB)
used = 0 (0.0MB)
free = 53673984 (51.1875MB)
0.0% used
concurrent mark-sweep generation:
capacity = 2166849536 (2066.46875MB)
used = 1317710872 (1256.6670150756836MB)
free = 849138664 (809.8017349243164MB)
60.81229222922842% used
That still leaves about 2 GB unaccounted for, so I used Native Memory Tracking, which shows:
Total: reserved=5986478KB, committed=3259678KB
- Java Heap (reserved=4194304KB, committed=2640352KB)
(mmap: reserved=4194304KB, committed=2640352KB)
- Class (reserved=1159154KB, committed=122778KB)
(classes #18260)
(malloc=4082KB #62204)
(mmap: reserved=1155072KB, committed=118696KB)
- Thread (reserved=145568KB, committed=145568KB)
(thread #141)
(stack: reserved=143920KB, committed=143920KB)
(malloc=461KB #707)
(arena=1187KB #280)
- Code (reserved=275048KB, committed=143620KB)
(malloc=25448KB #30875)
(mmap: reserved=249600KB, committed=118172KB)
- GC (reserved=25836KB, committed=20792KB)
(malloc=11492KB #1615)
(mmap: reserved=14344KB, committed=9300KB)
- Compiler (reserved=583KB, committed=583KB)
(malloc=453KB #769)
(arena=131KB #3)
- Internal (reserved=76399KB, committed=76399KB)
(malloc=76367KB #25878)
(mmap: reserved=32KB, committed=32KB)
- Symbol (reserved=21603KB, committed=21603KB)
(malloc=17791KB #201952)
(arena=3812KB #1)
- Native Memory Tracking (reserved=5096KB, committed=5096KB)
(malloc=22KB #261)
(tracking overhead=5074KB)
- Arena Chunk (reserved=190KB, committed=190KB)
(malloc=190KB)
- Unknown (reserved=82696KB, committed=82696KB)
(mmap: reserved=82696KB, committed=82696KB)
This still doesn't explain where the memory goes. Can someone shed light on how to locate the missing memory?
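A rough way to narrow this down is to compare the NMT totals with the process's actual memory mappings: the gap between NMT's committed total and the resident set usually sits in native allocations NMT cannot see (glibc malloc arenas, allocations made by JNI libraries, memory-mapped files). Below is a small sketch, assuming Linux and Python 3; the script and its names are purely illustrative, not a standard tool. It sums resident memory (Rss) per mapping from /proc/<pid>/smaps so the largest regions stand out:

import sys
from collections import defaultdict

def smaps_by_mapping(pid):
    # Sum resident memory (Rss) per mapping name from /proc/<pid>/smaps.
    totals = defaultdict(int)          # mapping name -> resident kB
    current = "[anon]"
    with open(f"/proc/{pid}/smaps") as f:
        for line in f:
            fields = line.split()
            if not fields:
                continue
            if not fields[0].endswith(":"):
                # Header line: "addr-range perms offset dev inode [path]"
                current = fields[5] if len(fields) > 5 else "[anon]"
            elif fields[0] == "Rss:":
                totals[current] += int(fields[1])   # values are reported in kB
    return totals

if __name__ == "__main__":
    totals = smaps_by_mapping(sys.argv[1])
    for name, kb in sorted(totals.items(), key=lambda kv: -kv[1])[:20]:
        print(f"{kb:>10} kB  {name}")

Running it with the Jetty PID as the only argument and comparing the large anonymous entries against the NMT committed figures should at least tell whether the missing ~2 GB is in anonymous native memory, thread stacks, or mapped files.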

Related

How to Fully Utilize CPU cores for skopt.forest_minimize

So I have the following code for running skopt.forest_minimize(), but the biggest challenge I am facing right now is that it is taking upwards of days to finish running even just 2 iterations.
import json
import skopt

SPACE = [skopt.space.Integer(4, max_neighbour, name='n_neighbors', prior='log-uniform'),
         skopt.space.Integer(6, 10, name='nr_cubes', prior='uniform'),
         skopt.space.Categorical(overlap_cat, name='overlap_perc')]

@skopt.utils.use_named_args(SPACE)   # must be applied as a decorator so params arrive by name
def objective(**params):
    score, scomp = tune_clustering(X_cont=X_cont, df=df, pl_brewer=pl_brewer, **params)
    if score == 0:
        print('saving new scomp')
        with open(scomp_file, 'w') as filehandle:
            json.dump(scomp, filehandle, default=json_default)
    return score

results = skopt.forest_minimize(objective, SPACE, n_calls=1, n_initial_points=1, callback=[scoring])
Is it possible to optimize the above code so that it computes faster? I noticed that it was barely making use of my CPU; the highest CPU utilization is about 30% (it's a 9th-gen i7 with 8 cores).
Also a question while I'm at it, is it possible to utilize a GPU for these computational tasks? I have a 3050 that I can use.
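If it helps, forest_minimize accepts an n_jobs argument that is forwarded to the underlying forest estimator, so at least the surrogate-model fitting can use several cores; the objective itself (tune_clustering) still runs one call at a time. A hedged sketch reusing objective and SPACE from above, with purely illustrative values for n_calls and n_initial_points:

# n_jobs=-1 lets the forest use all available cores while fitting the surrogate;
# each call to objective() is still executed serially.
results = skopt.forest_minimize(
    objective, SPACE,
    n_calls=20,              # the forest needs more than one or two calls to learn anything
    n_initial_points=5,
    n_jobs=-1,
    callback=[scoring],
)

As far as I know, scikit-optimize itself runs on the CPU (it builds on scikit-learn estimators), so a GPU would only help if tune_clustering can make use of it.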

Is there a string size limit when feeding .fromstring() method as input?

I'm working on multiple well-formed XML files whose sizes range from 100 MB to 4 GB. My goal is to read them as strings and then import them as ElementTree objects using the .fromstring() method (from the xml.etree.ElementTree module).
However, as the process goes on and the string size increases, two exceptions occur, both related to memory restrictions:
xml.etree.ElementTree.ParseError: out of memory: line 1, column 0
OverflowError: size does not fit in an int
It looks like the .fromstring() method enforces a size limit on the input string, somewhere around 1 GB...?
To debug this, I wrote a short script using a for loop:
import xml.etree.ElementTree as cElementTree   # cElementTree was merged into ElementTree

xmlFiles_list = [path1, path2, ...]
for fp in xmlFiles_list:
    xml_fo = open(fp, mode='r', encoding="utf-8")
    xml_asStr = xml_fo.read()
    xml_fo.close()
    print(len(xml_asStr.encode("utf-8")) / 10**9)   # display string size in GB
    try:
        etree = cElementTree.fromstring(xml_asStr)
        print(".fromstring() success!\n")
    except Exception as e:
        print(f"Error :{type(e)} {str(e)}\n")
        continue
The output is as follows:
0.895206753
.fromstring() success!
1.220224531
Error :<class 'xml.etree.ElementTree.ParseError'> out of memory: line 1, column 0
1.328233473
Error :<class 'xml.etree.ElementTree.ParseError'> out of memory: line 1, column 0
2.567867904
Error :<class 'OverflowError'> size does not fit in an int
4.080672538
Error :<class 'OverflowError'> size does not fit in an int
I found multiple workarounds to avoid this issue: the .parse() method, or the lxml module for better performance (a rough iterparse sketch is below). I just hope someone could shed some light on this:
Is there a specific string size limit in the xml.etree.ElementTree module and its .fromstring() method?
Why do I end up with two different exceptions as the string size increases? Are they related to the same memory-allocation restriction?
Python version/system: 3.9 (64-bit)
RAM: 32 GB
I hope my topic is clear enough; I'm new on Stack Overflow.
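One way to sidestep the limit entirely is to avoid building a multi-gigabyte Python string in the first place and let the parser read the file incrementally. A minimal sketch using iterparse; the tag-counting function is only an illustration of the pattern, not part of my actual processing:

import xml.etree.ElementTree as ET

def stream_count(path, tag):
    # Parse incrementally: only the current subtree is kept in memory,
    # provided processed elements are cleared as we go.
    count = 0
    for event, elem in ET.iterparse(path, events=("end",)):
        if elem.tag == tag:
            count += 1
        elem.clear()   # drop the children we no longer need
    return count

This keeps memory roughly proportional to the largest single element rather than to the whole file.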

RuntimeError on running ALBERT for obtaining encoding vectors from text

I'm trying to get feature vectors from the encoder model using pre-trained ALBERT v2 weights. I have an NVIDIA 1650 Ti GPU (4 GB) and sufficient RAM (8 GB), but for some reason I'm getting a RuntimeError saying:
RuntimeError: [enforce fail at …\c10\core\CPUAllocator.cpp:75] data.
DefaultCPUAllocator: not enough memory: you tried to allocate
491520000 bytes. Buy new RAM!
I'm really new to PyTorch and deep learning in general. Can anyone please tell me what is wrong?
My entire code:
import torch
from transformers import BertConfig, EncoderDecoderConfig, EncoderDecoderModel

encoded_test_data = tokenized_test_values['input_ids']
encoded_test_masks = tokenized_test_values['attention_mask']
encoded_train_data = torch.from_numpy(encoded_train_data).to(device)
encoded_masks = torch.from_numpy(encoded_masks).to(device)
encoded_test_data = torch.from_numpy(encoded_test_data).to(device)
encoded_test_masks = torch.from_numpy(encoded_test_masks).to(device)
config = EncoderDecoderConfig.from_encoder_decoder_configs(BertConfig(), BertConfig())
EnD_model = EncoderDecoderModel.from_pretrained('albert-base-v2', config=config)
feature_extractor = EnD_model.get_encoder()
feature_vector = feature_extractor.forward(input_ids=encoded_train_data, attention_mask=encoded_masks)
feature_test_vector = feature_extractor.forward(input_ids=encoded_test_data, attention_mask=encoded_test_masks)
Also, 491520000 bytes is about 490 MB, which should not be a problem.
I tried reducing the number of training examples and also the maximum padded input length. The OOM error still occurs even though the required space is now 153 MB, which should easily be manageable.
I have also maxed out PyCharm's heap limit to 2048 MB. I really don't know what to do now...
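For what it's worth, one thing that typically drives peak memory up in this kind of code is pushing the whole training set through the model in a single forward pass while autograd keeps activation buffers around. A minimal sketch of running the encoder without gradients and in small batches, reusing the names from the snippet above (the batch size of 32 is only illustrative, and it assumes a transformers version where the encoder output exposes last_hidden_state):

import torch

feature_extractor.eval()
chunks = []
with torch.no_grad():                        # no autograd buffers -> much lower peak memory
    for start in range(0, encoded_train_data.size(0), 32):
        batch_ids = encoded_train_data[start:start + 32]
        batch_mask = encoded_masks[start:start + 32]
        out = feature_extractor(input_ids=batch_ids, attention_mask=batch_mask)
        chunks.append(out.last_hidden_state.cpu())   # move finished batches off the GPU
feature_vector = torch.cat(chunks, dim=0)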

Linux soft lockup in memory

My system is an embedded Linux system (running kernel version 2.6.18). A client process sends data to a MySQL server, and the data is stored in a MySQL database on a RAID5 array assembled from four disks. The I/O pressure (wa%) is always above 20%, and MySQL's CPU utilization is very high.
After running for 5 or 6 hours, the system runs into a soft lockup state.
The stack traces are about releasing physical memory and writing cached data to disk.
Any suggestions in this circumstance?
BUG: soft lockup detected on CPU#0!
[<c043dc1c>] softlockup_tick+0x8f/0xb1
[<c0428cb5>] update_process_times+0x26/0x5c
[<c0411256>] smp_apic_timer_interrupt+0x5d/0x67
[<c04044e7>] apic_timer_interrupt+0x1f/0x24
[<c06fe0b9>] _spin_lock+0x5/0xf
[<c047db2a>] __mark_inode_dirty+0x50/0x176
[<c0424eef>] current_fs_time+0x4d/0x5e
[<c0475ccd>] touch_atime+0x51/0x94
[<c0440926>] do_generic_mapping_read+0x425/0x563
[<c044134b>] __generic_file_aio_read+0xf3/0x267
[<c043fcd0>] file_read_actor+0x0/0xd4
[<c04414fb>] generic_file_aio_read+0x3c/0x4d
[<c045d72d>] do_sync_read+0xc1/0xfd
[<c0431656>] autoremove_wake_function+0x0/0x37
[<c045e0e8>] vfs_read+0xa4/0x167
[<c045d66c>] do_sync_read+0x0/0xfd
[<c045e688>] sys_pread64+0x5e/0x62
[<c0403a27>] syscall_call+0x7/0xb
=======================
BUG: soft lockup detected on CPU#2!
[<c043dc1c>] softlockup_tick+0x8f/0xb1
[<c0428cb5>] update_process_times+0x26/0x5c
[<c0411256>] smp_apic_timer_interrupt+0x5d/0x67
[<c04044e7>] apic_timer_interrupt+0x1f/0x24
[<c06fe0bb>] _spin_lock+0x7/0xf
[<c04aaf17>] journal_try_to_free_buffers+0xf4/0x17b
[<c0442c52>] find_get_pages+0x28/0x5d
[<c049c4b1>] ext3_releasepage+0x0/0x7d
[<c045f0bf>] try_to_release_page+0x2c/0x46
[<c0447894>] invalidate_mapping_pages+0xc9/0x167
[<c04813b0>] drop_pagecache+0x86/0xd2
[<c048144e>] drop_caches_sysctl_handler+0x52/0x64
[<c04813fc>] drop_caches_sysctl_handler+0x0/0x64
[<c042623d>] do_rw_proc+0xe8/0xf4
[<c0426268>] proc_writesys+0x1f/0x24
[<c045df81>] vfs_write+0xa6/0x169
[<c0426249>] proc_writesys+0x0/0x24
[<c045e601>] sys_write+0x41/0x6a
[<c0403a27>] syscall_call+0x7/0xb
=======================
BUG: soft lockup detected on CPU#1!
[<c043dc1c>] softlockup_tick+0x8f/0xb1
[<c0428cb5>] update_process_times+0x26/0x5c
[<c0411256>] smp_apic_timer_interrupt+0x5d/0x67
[<c04044e7>] apic_timer_interrupt+0x1f/0x24
[<c06f007b>] inet_diag_dump+0x804/0x821
[<c06fe0bb>] _spin_lock+0x7/0xf
[<c047db2a>] __mark_inode_dirty+0x50/0x176
[<c043168d>] wake_bit_function+0x0/0x3c
[<c04ae0f6>] __journal_remove_journal_head+0xee/0x1a5
[<c0445ae8>] __set_page_dirty_nobuffers+0x87/0xc6
[<c04a908e>] __journal_unfile_buffer+0x8/0x11
[<c04ab94d>] journal_commit_transaction+0x8e0/0x1103
[<c0431656>] autoremove_wake_function+0x0/0x37
[<c04af690>] kjournald+0xa9/0x1e5
[<c0431656>] autoremove_wake_function+0x0/0x37
[<c04af5e7>] kjournald+0x0/0x1e5
[<c04314da>] kthread+0xde/0xe2
[<c04313fc>] kthread+0x0/0xe2
[<c0404763>] kernel_thread_helper+0x7/0x14
=======================
BUG: soft lockup detected on CPU#3!
[<c043dc1c>] softlockup_tick+0x8f/0xb1
[<c0428cb5>] update_process_times+0x26/0x5c
[<c0411256>] smp_apic_timer_interrupt+0x5d/0x67
[<c04044e7>] apic_timer_interrupt+0x1f/0x24
[<c06fe0bb>] _spin_lock+0x7/0xf
[<c047db2a>] __mark_inode_dirty+0x50/0x176
[<c0424eef>] current_fs_time+0x4d/0x5e
[<c0475ccd>] touch_atime+0x51/0x94
[<c0440926>] do_generic_mapping_read+0x425/0x563
[<c044134b>] __generic_file_aio_read+0xf3/0x267
[<c043fcd0>] file_read_actor+0x0/0xd4
[<c04414fb>] generic_file_aio_read+0x3c/0x4d
[<c045d72d>] do_sync_read+0xc1/0xfd
[<c0431656>] autoremove_wake_function+0x0/0x37
[<c045e0e8>] vfs_read+0xa4/0x167
[<c045d66c>] do_sync_read+0x0/0xfd
[<c045e688>] sys_pread64+0x5e/0x62
[<c0403a27>] syscall_call+0x7/0xb
=======================
Seriously, try something newer. 2.6.18 is >7 years old.
Looks like CPU#1 and CPU#3 are spinning in a spinlock on an inode structure.

Memory leak in Microsoft DTV-DVD Video decoder?

While rendering an H.264 video file in an AVI container, my application's memory consumption rises quickly, around 150 MB/min.
This is the link to an image of my graph: http://picturepush.com/public/8926555
If I use the LAV video decoder instead, there is no memory leak.
At first I suspected the leak was in my code, but then I switched off both of my sample grabber filters (by putting "return S_OK" at the start of the callback) and the leak continued.
I also tried releasing every filter after stopping the graph, like this, but it did not remove the leak:
if (m_pMediaControl)
{
    HRESULT hr = m_pMediaControl->Stop();

    LONG lCount;
    IUnknown* pUnk;
    IAMCollection* p_Collection;

    // Enumerate the graph's filters and release the reference returned for each one
    hr = m_pMediaControl->get_FilterCollection(reinterpret_cast<IDispatch**>(&p_Collection));
    hr = p_Collection->get_Count(&lCount);
    for (int i = 0; i < lCount; i++)
    {
        hr = p_Collection->Item(i, &pUnk);
        pUnk->Release();
    }
    p_Collection->Release();
}
m_pMediaControl.Release();
I will be happy for any suggestions on how to eliminate the memory leak.
I created different graphs in GraphEdit and observed repeated playback of a short (6 sec) H.264 video file:
picturepush.com/public/8931745 - full graph - Private Bytes grows by about 6 MB after every playback
picturepush.com/public/8931760 - with DMO converter, without sample grabber - no memory leak
picturepush.com/public/8931766 - with DMO converter, without sample grabber, but with video renderer - Private Bytes grows by about 7 MB after every playback
picturepush.com/public/8931770 - only the decoder - no memory leak
