Memory leak in Java off-heap memory when using Kafka Streams - memory-leaks

I am using Kafka Streams to join streams, but the off-heap memory is out of control.
I used jemalloc to find the cause.
At first, RocksDB accounted for a high percentage of the off-heap memory:
1806344172 67.4% 67.4% 1806344172 67.4% rocksdb::BlockFetcher::ReadBlockContents
588400270 22.0% 89.4% 588400270 22.0% os::malloc#921040
132120590 4.9% 94.3% 132120590 4.9% rocksdb::Arena::AllocateNewBlock
50331648 1.9% 96.2% 50331648 1.9% init
17587683 0.7% 96.8% 17981107 0.7% rocksdb::VersionSet::ProcessManifestWrites
15688131 0.6% 97.4% 15688131 0.6% rocksdb::WritableFileWriter::WritableFileWriter
12943699 0.5% 97.9% 12943699 0.5% rocksdb::port::cacheline_aligned_alloc
11800800 0.4% 98.4% 12588000 0.5% rocksdb::LRUCacheShard::Insert
8784504 0.3% 98.7% 1811954485 67.6% rocksdb::BlockBasedTable::PartitionedIndexIteratorState::NewSecondaryIterator
7606272 0.3% 99.0% 7606272 0.3% rocksdb::LRUHandleTable::Resize
As time went by, this changed to:
Total: 4502654593 B
3379447055 75.1% 75.1% 3379447055 75.1% os::malloc#921040
620666890 13.8% 88.8% 620666890 13.8% rocksdb::BlockFetcher::ReadBlockContents
142606352 3.2% 92.0% 142606352 3.2% rocksdb::Arena::AllocateNewBlock
129603986 2.9% 94.9% 129603986 2.9% rocksdb::port::cacheline_aligned_alloc
67797317 1.5% 96.4% 67797317 1.5% rocksdb::LRUHandleTable::Resize
50331648 1.1% 97.5% 50331648 1.1% init
32501412 0.7% 98.2% 230760042 5.1% Java_org_rocksdb_Options_newOptions__
18600150 0.4% 98.6% 19255895 0.4% rocksdb::VersionSet::ProcessManifestWrites
16393216 0.4% 99.0% 16393216 0.4% rocksdb::WritableFileWriter::WritableFileWriter
5629242 0.1% 99.1% 5629242 0.1% updatewindow
os::malloc#921040 consumes most of the memory and keeps growing.
Can anyone give some help?
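Not an answer to this specific jemalloc profile, but for context: a common way to keep RocksDB's off-heap usage bounded in Kafka Streams is to supply a RocksDBConfigSetter so that every state store shares one size-limited block cache and write buffer manager, roughly the bounded-memory pattern from the Kafka Streams documentation. The sketch below is only illustrative; the class name and the byte limits are made up for the example.

import java.util.Map;
import org.apache.kafka.streams.state.RocksDBConfigSetter;
import org.rocksdb.BlockBasedTableConfig;
import org.rocksdb.Cache;
import org.rocksdb.LRUCache;
import org.rocksdb.Options;
import org.rocksdb.WriteBufferManager;

// Illustrative limits: cap block caches at ~256 MiB and memtables at ~64 MiB in total.
public class BoundedMemoryRocksDBConfig implements RocksDBConfigSetter {

    // Shared (static) across every store instance, so the limit applies to the
    // whole Streams application rather than to each store separately.
    private static final Cache CACHE = new LRUCache(256 * 1024 * 1024L);
    private static final WriteBufferManager WRITE_BUFFER_MANAGER =
            new WriteBufferManager(64 * 1024 * 1024L, CACHE);

    @Override
    public void setConfig(final String storeName, final Options options, final Map<String, Object> configs) {
        final BlockBasedTableConfig tableConfig = (BlockBasedTableConfig) options.tableFormatConfig();
        tableConfig.setBlockCache(CACHE);
        // Charge index and filter blocks to the cache instead of unbounded off-heap memory.
        tableConfig.setCacheIndexAndFilterBlocks(true);
        // Charge memtable memory to the same cache.
        options.setWriteBufferManager(WRITE_BUFFER_MANAGER);
        options.setTableFormatConfig(tableConfig);
    }

    @Override
    public void close(final String storeName, final Options options) {
        // CACHE and WRITE_BUFFER_MANAGER are shared, so they must not be closed here;
        // only close objects created per store inside setConfig().
    }
}

It would be registered with something like props.put(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG, BoundedMemoryRocksDBConfig.class). If memory still grows with a bounded cache, one thing worth checking is whether any RocksDB objects (Options, Cache, filters) are created per store in a config setter and never closed.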

Related

Why does Node.js slow down during iterations?

I made a simple program (code below) with two loops. The inner loop performs 5,900,000,000 iterations without any complicated calculation, and the outer loop measures how long the inner one takes. After 3 to 5 iterations of the outer loop, the inner loop takes much, much longer than before (results also below).
I tested it in two environments: my local MacBook Pro laptop and an AWS EC2 instance.
Do you have any idea why it works this way and what is causing the slowdown?
const calculate = () => {
let a = 0;
for (let i = 0; i < 5900000000; ++i) {
a++
}
};
for (let i = 0; i < 10; ++i) {
console.time();
calculate();
console.timeEnd();
}
Result:
default: 4.875s
default: 5.566s
default: 29.625s
default: 29.805s
default: 29.698s
default: 29.595s
default: 29.733s
default: 29.611s
default: 29.597s
default: 29.476s
I also ran the app in profiling mode to see what happens: NODE_ENV=production node --prof src/index.js
Result:
Statistical profiling result from isolate-0x148040000-5612-v8.log, (81904 ticks, 2 unaccounted, 0 excluded).
[Shared libraries]:
ticks total nonlib name
9 0.0% /usr/lib/system/libsystem_pthread.dylib
5 0.0% /usr/lib/system/libsystem_c.dylib
4 0.0% /usr/lib/libc++.1.dylib
2 0.0% /usr/lib/system/libsystem_malloc.dylib
2 0.0% /usr/lib/system/libsystem_kernel.dylib
[JavaScript]:
ticks total nonlib name
81837 99.9% 99.9% LazyCompile: *calculate /Users/jjuszkiewicz/workspace/nodejs/src/index.js:1:19
[C++]:
ticks total nonlib name
19 0.0% 0.0% T node::contextify::CompiledFnEntry::WeakCallback(v8::WeakCallbackInfo<node::contextify::CompiledFnEntry> const&)
17 0.0% 0.0% T node::builtins::BuiltinLoader::CompileFunction(v8::FunctionCallbackInfo<v8::Value> const&)
5 0.0% 0.0% T _semaphore_destroy
1 0.0% 0.0% t std::__1::basic_ostream<char, std::__1::char_traits<char> >& std::__1::__put_character_sequence<char, std::__1::char_traits<char> >(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, char const*, unsigned long)
1 0.0% 0.0% T _mach_port_allocate
[Summary]:
ticks total nonlib name
81837 99.9% 99.9% JavaScript
43 0.1% 0.1% C++
0 0.0% 0.0% GC
22 0.0% Shared libraries
2 0.0% Unaccounted
[C++ entry points]:
ticks cpp total name
20 52.6% 0.0% T node::contextify::CompiledFnEntry::WeakCallback(v8::WeakCallbackInfo<node::contextify::CompiledFnEntry> const&)
17 44.7% 0.0% T node::builtins::BuiltinLoader::CompileFunction(v8::FunctionCallbackInfo<v8::Value> const&)
1 2.6% 0.0% t std::__1::basic_ostream<char, std::__1::char_traits<char> >& std::__1::__put_character_sequence<char, std::__1::char_traits<char> >(std::__1::basic_ostream<char, std::__1::char_traits<char> >&, char const*, unsigned long)
[Bottom up (heavy) profile]:
Note: percentage shows a share of a particular caller in the total
amount of its parent calls.
Callers occupying less than 1.0% are not shown.
ticks parent name
81837 99.9% LazyCompile: *calculate /Users/jjuszkiewicz/workspace/nodejs/src/index.js:1:19
81837 100.0% Function: ~<anonymous> /Users/jjuszkiewicz/workspace/nodejs/src/index.js:1:1
81837 100.0% LazyCompile: ~Module._compile node:internal/modules/cjs/loader:1173:37
81837 100.0% LazyCompile: ~Module._extensions..js node:internal/modules/cjs/loader:1227:37
81837 100.0% LazyCompile: ~Module.load node:internal/modules/cjs/loader:1069:33
81837 100.0% LazyCompile: ~Module._load node:internal/modules/cjs/loader:851:24

Index/Match Array Not Returning Some Combinations Correctly

I am using an INDEX/MATCH formula entered as an array (provided by an awesome person on another forum) to return a specific result using three criteria, and I just realized that a particular combination is not returning properly. The formula should return the cell where these three criteria intersect:
BM96 (LTV) LTV Data Range is AZ60:AZ69
BM97 (Coverage) Coverage Data Range is BA60:BA69
BM98 (Credit Score) Credit Score Range is BB59:BI59
Data Table to return value from is BB60:BI69. Here is the formula:
=IFERROR(INDEX(BB60:BI69, MATCH(IF(BM96>95%, 97%, BM96)&BM97, AZ60:AZ69&BA60:BA69,-1), MATCH(IF(BM98>760, 760,BM98), BB59:BI59,-1)), INDEX(BB60:BI69, MATCH(IF(BM96>95%, 97%, BM96)&BM97, AZ60:AZ69&BA60:BA69, -1),MATCH(IF(BM98>760, 760, BM98), BB59:BI59,-1)))*100
(I am pressing Ctrl+Shift+Enter when completing the formula.)
When using the following combination of criteria, the result should be 0.96, but instead it returns 0.72, which is in the correct column but two rows lower than it should be:
LTV 92%
Coverage 30%
Credit Score 680
Here is the data:
AZ BA BB BC BD BE BF BG BH BI
59 LTV Coverage 760 759 739 719 699 679 659 639
60 97% 35% 0.58% 0.70% 0.87% 0.99% 1.21% 1.54% 1.65% 1.86%
61 97% 25% 0.46% 0.58% 0.70% 0.79% 0.98% 1.23% 1.31% 1.50%
62 97% 18% 0.39% 0.51% 0.61% 0.68% 0.85% 1.05% 1.17% 1.27%
63 95% 30% 0.38% 0.53% 0.66% 0.78% 0.96% 1.28% 1.33% 1.42%
64 95% 25% 0.34% 0.48% 0.59% 0.68% 0.87% 1.11% 1.19% 1.25%
65 95% 16% 0.30% 0.40% 0.48% 0.58% 0.72% 0.95% 1.04% 1.13%
66 90% 25% 0.28% 0.38% 0.46% 0.55% 0.65% 0.90% 0.91% 0.94%
67 90% 12% 0.22% 0.27% 0.32% 0.39% 0.46% 0.62% 0.65% 0.73%
68 85% 12% 0.19% 0.20% 0.23% 0.25% 0.28% 0.38% 0.40% 0.44%
69 85% 6% 0.17% 0.19% 0.22% 0.24% 0.27% 0.37% 0.39% 0.42%
I have tried lots of things but can't seem to make this work. I know the issue is related to the LTV, but I can't understand why it is returning the row for 16% rather than 30%.
Any help would be greatly appreciated.
Try this alternate non-CSE formula using the newer AGGREGATE function.
=INDEX(BB60:BI69, AGGREGATE(14, 6, ROW(1:10)/((AZ60:AZ69>=MIN(AZ51, MAX(AZ60:AZ69)))*(BA60:BA69>=AZ52)), 1), IFERROR(MATCH(AZ53, BB59:BI59, -1), 1))

kmalloc-256 seems to be taking most of the memory. How can I free this?

I have a Linux instance (Amazon Linux, Linux ip-xxx 4.9.20-11.31.amzn1.x86_64 #1) that runs Jenkins. It occasionally stops working because of a lack of the memory needed for a job.
Based on my investigation with the free command and /proc/meminfo, it seems that Slab is taking up most of the memory available on the instance.
[root@ip-xxx ~]# free -tm
total used free shared buffers cached
Mem: 7985 7205 779 0 19 310
-/+ buffers/cache: 6876 1108
Swap: 0 0 0
Total: 7985 7205 779
[root@ip-xxx ~]# cat /proc/meminfo | grep "Slab\|claim"
Slab: 6719244 kB
SReclaimable: 34288 kB
SUnreclaim: 6684956 kB
I found that I can purge the dentry cache by running echo 3 > /proc/sys/vm/drop_caches, but how can I purge kmalloc-256? Or is there a way to find which process is using the kmalloc-256 memory space?
[root@ip-xxx ~]# slabtop -o | head -n 15
Active / Total Objects (% used) : 26805556 / 26816810 (100.0%)
Active / Total Slabs (% used) : 837451 / 837451 (100.0%)
Active / Total Caches (% used) : 85 / 111 (76.6%)
Active / Total Size (% used) : 6696903.08K / 6701323.05K (99.9%)
Minimum / Average / Maximum Object : 0.01K / 0.25K / 8.00K
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
26658528 26658288 99% 0.25K 833079 32 6664632K kmalloc-256
21624 21009 97% 0.12K 636 34 2544K kernfs_node_cache
20055 20055 100% 0.19K 955 21 3820K dentry
10854 10646 98% 0.58K 402 27 6432K inode_cache
10624 9745 91% 0.03K 83 128 332K kmalloc-32
7395 7395 100% 0.05K 87 85 348K ftrace_event_field
6912 6384 92% 0.02K 27 256 108K kmalloc-16
6321 5581 88% 0.19K 301 21 1204K cred_jar

Creating a Cost Profile

Can anyone help me with this little problem? I will try and explain as best I can!
We have a P6 schedule exported into Excel (don't ask why we don't just do this in P6) and some estimates. We have a start and finish date for each activity from the schedule dump and the cost of each activity from the estimate. What we would like to do is spread these costs in a cost-loaded curve. Simple enough if every activity had the same cost profile, but they don't, and here is the tricky bit. We would like to be able to select the profile (6 in all), and then Excel would do its magic and apportion the cost between the two dates according to the cost profile selected... Simples, I hope!
Curve Profiles
1 10% 10% 10% 10% 10% 10% 10% 10% 10% 10% 100%
2 1% 2% 3% 7% 13% 17% 20% 19% 13% 5% 100%
3 5% 13% 19% 20% 17% 13% 7% 3% 2% 1% 100%
4 3% 7% 11% 14% 15% 15% 14% 11% 7% 3% 100%
5 100% 0% 0% 0% 0% 0% 0% 0% 0% 0% 100%
6 0% 0% 0% 0% 0% 0% 0% 0% 0% 100% 100%
P6 is Primavera, a planning tool. We are going to use a polynomial formula to determine the split along the curve for the total value: =3.10862446895044E-15*C16^6 + 0.0000128205126657122*C16^5 - 0.000384615378834496*C16^4 + 0.00211538450821536*C16^3 + 0.0173076931969263*C16^2 - 0.0324778592548682*C16 + 0.0136363733254257. We are struggling to work out how to get Excel to determine the date set automatically. Below is a cut of the data.
Activity ID Activity Name Duration Start Finish Total Month Total Float Budgeted Total Cost
A1740 Major Permissions - Project Management 734 01-Apr-17 22-Feb-19 23 1939 £6,748,243
A1630 MPP2 - Main Site DCO Contracts 742 01-Apr-17 06-Mar-19 24 1931 £6,027,265
A1650 MPP3 - SP&C Contracts 553 01-Apr-17 08-Jun-18 15 2120 £299,795
A1660 MPP4 - Highways Contracts 443 01-Apr-17 29-Dec-17 9 2230 £881,005
A1670 MPP5 - Worker's Accomodation Contracts 445 01-Apr-17 03-Jan-18 10 2228 £920,193
A1690 MPP6 - Logistics, Park & Ride Contracts 746 01-Apr-17 12-Mar-19 24 1927 £581,667
A1720 MPP7 - Marine Licences Contracts 709 01-Apr-17 18-Jan-19 22 1964 £1,879,577
A1730 MPP8 - Environmental Permits Contracts 546 01-Apr-17 30-May-18 14 2127 £1,291,958
We did it! Don't ask me how, but someone with a much bigger brain than me worked out the formula. If anyone wants it, just ask and I can send it over, without the data, just the formula. Here is a taster!
Used for working out the timescale:
=IF(MONTH($D8)&YEAR($D8)=MONTH(I$2)&YEAR(I$2),(1-((DAY($D8)-1)/HLOOKUP(MONTH($D8),Monthdays,'Base Data'!$A$4,FALSE))),IF(MONTH($E8)&YEAR($E8)=MONTH(I$2)&YEAR(I$2),(DAY($E8))/HLOOKUP(MONTH($E8),Monthdays,'Base Data'!$A$4,FALSE)+H8,IF(OR(MONTH($E8)&YEAR($E8)=(MONTH(I$2)-1)&YEAR(I$2),MONTH($E8)&YEAR($E8)=12&YEAR(I$2)-1),0,IF(H8-G8>0,H8+1,H8))))
For the cost.
=IF(OR('Cumulative Cash Flow'!$G176="Linear",'Cumulative Cash Flow'!$G176="Project-S; Early",'Cumulative Cash Flow'!$G176="Project-S",'Cumulative Cash Flow'!$G176="Project-S; Late"),((VLOOKUP('Cumulative Cash Flow'!$G176,Curvetype,2,FALSE)*(11*'Time-Phase'!I176/'Cumulative Cash Flow'!$H176)^6+VLOOKUP('Cumulative Cash Flow'!$G176,Curvetype,3,FALSE)*(11*'Time-Phase'!I176/'Cumulative Cash Flow'!$H176)^5+VLOOKUP('Cumulative Cash Flow'!$G176,Curvetype,4,FALSE)*(11*'Time-Phase'!I176/'Cumulative Cash Flow'!$H176)^4+VLOOKUP('Cumulative Cash Flow'!$G176,Curvetype,5,FALSE)*(11*'Time-Phase'!I176/'Cumulative Cash Flow'!$H176)^3+VLOOKUP('Cumulative Cash Flow'!$G176,Curvetype,6,FALSE)*(11*'Time-Phase'!I176/'Cumulative Cash Flow'!$H176)^2+VLOOKUP('Cumulative Cash Flow'!$G176,Curvetype,7,FALSE)*(11*'Time-Phase'!I176/'Cumulative Cash Flow'!$H176))
+VLOOKUP('Cumulative Cash Flow'!$G176,Curvetype,8,FALSE)*('Time-Phase'!I176/'Cumulative Cash Flow'!$H176))*'Cumulative Cash Flow'!$F176,IF(AND('Cumulative Cash Flow'!$G176="Upfront Payment",'Time-Phase'!I176>0,'Time-Phase'!I176<=1),'Cumulative Cash Flow'!$F176*VLOOKUP('Cumulative Cash Flow'!$G176,Curvetype,8,FALSE),IF(AND('Cumulative Cash Flow'!$G176="Final Payment",'Time-Phase'!I176='Cumulative Cash Flow'!$H176),'Cumulative Cash Flow'!$F176*VLOOKUP('Cumulative Cash Flow'!$G176,Curvetype,8,FALSE),0)))

Golang: What is etext?

I've started to profile some of my Go 1.2 code, and the top item is always something named 'etext'. I've searched around but couldn't find much information about it other than that it might relate to call depth in goroutines. However, I'm not using any goroutines, and 'etext' is still taking up 75% or more of the total execution time.
(pprof) top20
Total: 171 samples
128 74.9% 74.9% 128 74.9% etext
Can anybody explain what this is and if there is any way to reduce the impact?
I hit the same problem, and then I found this: pprof broken in go 1.2?. To verify that it is really a 1.2 bug, I wrote the following "hello world" program:
package main

import (
    "fmt"
    "testing"
)

func BenchmarkPrintln(t *testing.B) {
    TestPrintln(nil)
}

func TestPrintln(t *testing.T) {
    for i := 0; i < 10000; i++ {
        fmt.Println("hello " + " world!")
    }
}
As you can see, it only calls fmt.Println.
You can compile this with "go test -c ."
Run it with "./test.test -test.bench . -test.cpuprofile=test.prof"
See the result with "go tool pprof test.test test.prof"
(pprof) top10
Total: 36 samples
18 50.0% 50.0% 18 50.0% syscall.Syscall
8 22.2% 72.2% 8 22.2% etext
4 11.1% 83.3% 4 11.1% runtime.usleep
3 8.3% 91.7% 3 8.3% runtime.futex
1 2.8% 94.4% 1 2.8% MHeap_AllocLocked
1 2.8% 97.2% 1 2.8% fmt.(*fmt).padString
1 2.8% 100.0% 1 2.8% os.epipecheck
0 0.0% 100.0% 1 2.8% MCentral_Grow
0 0.0% 100.0% 33 91.7% System
0 0.0% 100.0% 3 8.3% _/home/xxiao/work/test.BenchmarkPrintln
The above result was obtained using Go 1.2.1.
Then I did the same thing using Go 1.1.1 and got the following result:
(pprof) top10
Total: 10 samples
2 20.0% 20.0% 2 20.0% scanblock
1 10.0% 30.0% 1 10.0% fmt.(*pp).free
1 10.0% 40.0% 1 10.0% fmt.(*pp).printField
1 10.0% 50.0% 2 20.0% fmt.newPrinter
1 10.0% 60.0% 2 20.0% os.(*File).Write
1 10.0% 70.0% 1 10.0% runtime.MCache_Alloc
1 10.0% 80.0% 1 10.0% runtime.exitsyscall
1 10.0% 90.0% 1 10.0% sweepspan
1 10.0% 100.0% 1 10.0% sync.(*Mutex).Lock
0 0.0% 100.0% 6 60.0% _/home/xxiao/work/test.BenchmarkPrintln
You can see that the 1.2.1 result does not make much sense: Syscall and etext take most of the time, while the 1.1.1 result looks right.
So I'm convinced that it really is a 1.2.1 bug. I switched to Go 1.1.1 in my real project, and I'm satisfied with the profiling results now.
I think Mathias Urlichs is right regarding missing debugging symbols in your cgo code. It's worth noting that some standard packages like net and syscall make use of cgo.
If you scroll down to the bottom of this doc to the section called Caveats, you can see that the third bullet says...
If the program linked in a library that was not compiled with enough symbolic information, all samples associated with the library may be charged to the last symbol found in the program before the library. This will artificially inflate the count for that symbol.
I'm not 100% positive this is what's happening, but I'm betting that this is why etext appears to be so busy (in other words, etext is a collection of various functions that don't have enough symbolic information for pprof to analyze properly).