Why does LibGDX keep increasing the heap? - memory-leaks

Why does the heap keep going up for no reason? If your answer is "just let the GC handle it", I am still wondering WHY it goes up.
Here is my code:
public class GameCore extends ApplicationAdapter {
    SpriteBatch batch;
    Texture texture;
    StopWatch sw = new StopWatch();
    long current, held;

    @Override
    public void create() {
        batch = new SpriteBatch();
        texture = new Texture(Gdx.files.internal("badlogic.jpg"));
        sw.start();
    }

    @Override
    public void render() {
        Gdx.gl.glClearColor(1, 0, 0, 1);
        Gdx.gl.glClear(GL20.GL_COLOR_BUFFER_BIT);
        batch.begin();
        batch.draw(texture, 0, 0);
        batch.end();
        if (Gdx.app.getJavaHeap() > held) {
            current = Gdx.app.getJavaHeap();
            held = current;
            String f = "UP: " + Mis.formatMilliseconds(sw.getCurrent()) + "\t"
                    + Mis.ramStatistics(Mis.BYTE_TO_MB_FACTOR, true) + "\t, byGdx:" + Gdx.app.getJavaHeap();
            System.out.println(f);
        } else if (Gdx.app.getJavaHeap() < held) {
            // if gc worked
            held = current;
            String f = "DOWN: " + Mis.formatMilliseconds(sw.getCurrent()) + "\t"
                    + Mis.ramStatistics(Mis.BYTE_TO_MB_FACTOR, true) + "\t, byGdx:" + Gdx.app.getJavaHeap();
            System.out.println(f);
        }
    }

    @Override
    public void dispose() {
        batch.dispose();
        texture.dispose(); // Texture is a native resource and should be disposed too
    }
}
The unfortunate results:
UP: 1 ms Heap: (11/1796) MB, 0.59153545% , byGdx:11140048
UP: 58 sec, 58171 ms Heap: (12/1796) MB, 0.62772965% , byGdx:11821672
UP: 1 min, 111 sec, 111705 ms Heap: (12/1796) MB, 0.66392213% , byGdx:12503264
UP: 32 min, 1978 sec, 1978210 ms Heap: (25/1796) MB, 1.3516115% , byGdx:25454120
UP: 48 min, 2887 sec, 2887645 ms Heap: (31/1796) MB, 1.6773589% , byGdx:31588736
After 56 minutes of growth (up to 34 MB), the GC finally cleaned up (down to 26 MB), and LibGDX started increasing the heap again for no particular reason.
What pushed me to ask this question: a very tiny game with only 2 textures made LibGDX consume 66 MB within 2 minutes. I thought I was using LibGDX wrongly, but I am not.

Java is a managed language, which means that it (the JVM/GC) manages memory for you. So unless you have an actual issue (e.g. you run out of memory), it is typically not very useful to draw any conclusion solely from the overall memory usage. For all you know, the JVM/GC might simply never have released any (temporary) objects yet, because there was no reason to do so.
If you think there really is an issue caused by too much memory usage (e.g. you run out of memory), then you had better use a profiler to see what is actually going on. But from your description, it looks like you don't have an actual issue.
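To make that concrete, here is a minimal plain-Java sketch (no LibGDX involved; the exact numbers depend entirely on your JVM and GC): short-lived temporaries raise the reported heap usage until the GC decides it is worth collecting them.
public class HeapGrowthDemo {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long before = rt.totalMemory() - rt.freeMemory();

        // simulate per-frame garbage: short-lived strings, like a render
        // loop produces; nothing is leaked, it is simply not collected yet
        for (int i = 0; i < 100_000; i++) {
            String tmp = "frame " + i;
        }

        long after = rt.totalMemory() - rt.freeMemory();
        System.out.println("Heap grew by ~" + (after - before) / 1024 + " KB of collectable garbage");

        System.gc(); // a hint, not a command - the JVM may ignore it
        long afterGc = rt.totalMemory() - rt.freeMemory();
        System.out.println("After the GC hint: ~" + afterGc / 1024 + " KB in use");
    }
}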

Related

Efficient way to iterate collection objects in Groovy 2.x?

What is the best and fastest way to iterate over Collection objects in Groovy? I know there are several Groovy collection utility methods, but they use closures, which are slow.
The final result in your specific case might be different; however, benchmarking 5 different iteration variants available in Groovy shows that the old Java for-each loop is the most efficient one. Take a look at the following example, where we iterate over 100 million elements and calculate the total sum of these numbers in a very imperative way:
@Grab(group='org.gperfutils', module='gbench', version='0.4.3-groovy-2.4')

import java.util.concurrent.atomic.AtomicLong
import java.util.function.Consumer

def numbers = (1..100_000_000)

def r = benchmark {
    'numbers.each {}' {
        final AtomicLong result = new AtomicLong()
        numbers.each { number -> result.addAndGet(number) }
    }
    'for (int i = 0 ...)' {
        final AtomicLong result = new AtomicLong()
        for (int i = 0; i < numbers.size(); i++) {
            result.addAndGet(numbers[i])
        }
    }
    'for-each' {
        final AtomicLong result = new AtomicLong()
        for (int number : numbers) {
            result.addAndGet(number)
        }
    }
    'stream + closure' {
        final AtomicLong result = new AtomicLong()
        numbers.stream().forEach { number -> result.addAndGet(number) }
    }
    'stream + anonymous class' {
        final AtomicLong result = new AtomicLong()
        numbers.stream().forEach(new Consumer<Integer>() {
            @Override
            void accept(Integer number) {
                result.addAndGet(number)
            }
        })
    }
}

r.prettyPrint()
This is just a simple example where we benchmark the cost of iterating over a collection, regardless of what operation is executed for every element (all variants use the same operation, to give the most accurate results). And here are the results (time measurements are expressed in nanoseconds):
Environment
===========
* Groovy: 2.4.12
* JVM: OpenJDK 64-Bit Server VM (25.181-b15, Oracle Corporation)
* JRE: 1.8.0_181
* Total Memory: 236 MB
* Maximum Memory: 3497 MB
* OS: Linux (4.18.9-100.fc27.x86_64, amd64)
Options
=======
* Warm Up: Auto (- 60 sec)
* CPU Time Measurement: On
WARNING: Timed out waiting for "numbers.each {}" to be stable
user system cpu real
numbers.each {} 7139971394 11352278 7151323672 7246652176
for (int i = 0 ...) 6349924690 5159703 6355084393 6447856898
for-each 3449977333 826138 3450803471 3497716359
stream + closure 8199975894 193599 8200169493 8307968464
stream + anonymous class 3599977808 3218956 3603196764 3653224857
Conclusion
Java's for-each is as fast as Stream + anonymous class (Groovy 2.x does not allow lambda expressions).
The old for (int i = 0; ...) loop is almost twice as slow as for-each - most probably because of the additional effort of fetching the value at a given index.
Groovy's each method is a little bit faster than the stream + closure variant, and both are more than twice as slow as the fastest one.
It's important to run benchmarks for a specific use case to get the most accurate answer. For instance, the Stream API will most probably be the best choice if other operations are applied alongside the iteration (filtering, mapping, etc. - see the sketch below). For simple iterations from the first to the last element of a given collection, the old Java for-each may give the best results, because it does not produce much overhead.
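As a quick illustration of that trade-off (a hypothetical Java 8 sketch, not part of the benchmark above), this is the kind of pipeline where the Stream API shines, because filtering, mapping, and reduction are expressed in a single pass:
import java.util.Arrays;
import java.util.List;

public class StreamPipelineExample {
    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

        long sumOfEvenSquares = numbers.stream()
                .filter(n -> n % 2 == 0)      // keep the even numbers
                .mapToLong(n -> (long) n * n) // square them without extra boxing
                .sum();

        System.out.println(sumOfEvenSquares); // prints 220
    }
}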
Also, the size of the collection matters. For instance, if we take the example above but iterate over 100k elements instead of 100 million, the slowest variant costs 0.82 ms versus 0.38 ms for the fastest. If you build a system where every nanosecond matters, you have to pick the most efficient solution. But if you build a simple CRUD application, it doesn't matter whether iterating over a collection takes 0.82 or 0.38 milliseconds - the cost of a database connection is at least 50 times bigger, so saving approximately 0.44 milliseconds would not make any impact.
// Results for iterating over 100k elements
Environment
===========
* Groovy: 2.4.12
* JVM: OpenJDK 64-Bit Server VM (25.181-b15, Oracle Corporation)
* JRE: 1.8.0_181
* Total Memory: 236 MB
* Maximum Memory: 3497 MB
* OS: Linux (4.18.9-100.fc27.x86_64, amd64)
Options
=======
* Warm Up: Auto (- 60 sec)
* CPU Time Measurement: On
user system cpu real
numbers.each {} 717422 0 717422 722944
for (int i = 0 ...) 593016 0 593016 600860
for-each 381976 0 381976 387252
stream + closure 811506 5884 817390 827333
stream + anonymous class 408662 1183 409845 416381
UPDATE: Dynamic invocation vs static compilation
There is one more factor worth taking into account - static compilation. Below you can find the results of the same benchmark for a 10-million-element collection:
Environment
===========
* Groovy: 2.4.12
* JVM: OpenJDK 64-Bit Server VM (25.181-b15, Oracle Corporation)
* JRE: 1.8.0_181
* Total Memory: 236 MB
* Maximum Memory: 3497 MB
* OS: Linux (4.18.10-100.fc27.x86_64, amd64)
Options
=======
* Warm Up: Auto (- 60 sec)
* CPU Time Measurement: On
user system cpu real
Dynamic each {} 727357070 0 727357070 731017063
Static each {} 141425428 344969 141770397 143447395
Dynamic for-each 369991296 619640 370610936 375825211
Static for-each 92998379 27666 93026045 93904478
Dynamic for (int i = 0; ...) 679991895 1492518 681484413 690961227
Static for (int i = 0; ...) 173188913 0 173188913 175396602
As you can see, turning on static compilation (with the @CompileStatic class annotation, for instance) is a game changer. Of course Java's for-each is still the most efficient; however, its static variant is almost 4 times faster than the dynamic one. Static Groovy each {} is 5 times faster than dynamic each {}. And the static for loop is also 4 times faster than the dynamic for loop.
Conclusion - for 10 million elements, static numbers.each {} takes 143 milliseconds while static for-each takes 93 milliseconds for the same collection. Scaled down to a collection of 100k elements, static numbers.each {} would cost roughly 1.4 ms and static for-each roughly 0.9 ms. Both are very fast, and the real difference only appears when the size of the collection explodes to 100+ million elements.
Java stream from a compiled Java class
And to give you some perspective - here are the results of a Java class running stream().forEach() on 10 million elements, for comparison:
Java stream.forEach() 87271350 160988 87432338 88563305
It is just a little bit faster than the statically compiled for-each in Groovy code.
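The Java class itself is not shown above; a plausible reconstruction of what was measured (the plain System.nanoTime() harness is an assumption - the original presumably used a gbench-comparable measurement) could look like this:
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class JavaStreamBenchmark {
    public static void main(String[] args) {
        // same workload as the Groovy variants: 10 million boxed integers
        List<Integer> numbers = IntStream.rangeClosed(1, 10_000_000)
                .boxed()
                .collect(Collectors.toList());

        AtomicLong result = new AtomicLong();
        long start = System.nanoTime();
        numbers.stream().forEach(number -> result.addAndGet(number));
        long elapsed = System.nanoTime() - start;

        System.out.println("sum = " + result.get() + " in " + elapsed + " ns");
    }
}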

Do DI containers leak memory, or does the BenchmarkDotNet MemoryDiagnoser deliver inaccurate measurements?

Introduction
We are trying to catch potential memory leaks using BenchmarkDotNet.
For simplicity, here is an unsophisticated TestClass:
public class TestClass
{
    private readonly string _eventName;

    public TestClass(string eventName)
    {
        _eventName = eventName;
    }

    public void TestMethod() =>
        Console.Write($@"{_eventName} ");
}
We are implementing the benchmarking through NUnit tests in netcoreapp2.0:
[TestFixture]
[MemoryDiagnoser]
public class TestBenchmarks
{
    [Test]
    public void RunTestBenchmarks() =>
        BenchmarkRunner.Run<TestBenchmarks>(new BenchmarksConfig());

    [Benchmark]
    public void TestBenchmark1() =>
        CreateTestClass("Test");

    private void CreateTestClass(string eventName)
    {
        var testClass = new TestClass(eventName);
        testClass.TestMethod();
    }
}
The test output contains the following summary:
Method | Mean | Error | Allocated |
--------------- |-----:|------:|----------:|
TestBenchmark1 | NA | NA | 0 B |
The test output also contains all the Console.Write output, which proves that 0 B here means no memory was leaked, rather than that no code was run because of compiler optimization.
Problem
The confusion begins when we attempt to resolve TestClass with the TinyIoC container:
[TestFixture]
[MemoryDiagnoser]
public class TestBenchmarks
{
    private TinyIoCContainer _container;

    [GlobalSetup]
    public void SetUp() =>
        _container = TinyIoCContainer.Current;

    [Test]
    public void RunTestBenchmarks() =>
        BenchmarkRunner.Run<TestBenchmarks>(new BenchmarksConfig());

    [Benchmark]
    public void TestBenchmark1() =>
        ResolveTestClass("Test");

    private void ResolveTestClass(string eventName)
    {
        var testClass = _container.Resolve<TestClass>(
            NamedParameterOverloads.FromIDictionary(
                new Dictionary<string, object> {["eventName"] = eventName}));
        testClass.TestMethod();
    }
}
The summary indicates that 1.07 KB was leaked:
Method | Mean | Error | Allocated |
--------------- |-----:|------:|----------:|
TestBenchmark1 | NA | NA | 1.07 KB |
The Allocated value increases proportionally to the number of ResolveTestClass calls in TestBenchmark1; the summary for
[Benchmark]
public void TestBenchmark1()
{
    ResolveTestClass("Test");
    ResolveTestClass("Test");
}
is
Method | Mean | Error | Allocated |
--------------- |-----:|------:|----------:|
TestBenchmark1 | NA | NA | 2.14 KB |
This indicates that either TinyIoC keeps a reference to each resolved object (which does not seem to be true, according to the source code) or the BenchmarkDotNet measurements include some additional memory allocations outside of the method marked with the [Benchmark] attribute.
The config used in both cases:
public class BenchmarksConfig : ManualConfig
{
public BenchmarksConfig()
{
Add(JitOptimizationsValidator.DontFailOnError);
Add(DefaultConfig.Instance.GetLoggers().ToArray());
Add(DefaultConfig.Instance.GetColumnProviders().ToArray());
Add(Job.Default
.WithLaunchCount(1)
.WithTargetCount(1)
.WithWarmupCount(1)
.WithInvocationCount(16));
Add(MemoryDiagnoser.Default);
}
}
By the way, replacing TinyIoC with the Autofac dependency injection framework didn't change the situation much.
Questions
Does this mean all DI frameworks have to implement some sort of cache for resolved objects? Does it mean BenchmarkDotNet is being used in the wrong way in the given example? Is it a good idea to hunt for memory leaks with the combination of NUnit and BenchmarkDotNet in the first place?
I am the person who implemented MemoryDiagnoser for BenchmarkDotNet, and I am very happy to answer this question.
But first I am going to describe how the MemoryDiagnoser works:
1. It reads the amount of allocated memory by using the available API.
2. It performs one extra iteration of benchmark runs. In your case, that's 16 (.WithInvocationCount(16)).
3. It reads the amount of allocated memory again, using the same API.
4. final result = (totalMemoryAfter - totalMemoryBefore) / invocationCount
How accurate is the result? It is as accurate as the APIs we are using: GC.GetAllocatedBytesForCurrentThread() for .NET Core 1.1+ and AppDomain.MonitoringTotalAllocatedMemorySize for .NET 4.6+.
The thing called the GC Allocation Quantum defines the size of allocated memory. It is usually 8k bytes.
What this really means: if we allocate a single object with new object() and the GC needs to allocate memory for it (because the current segment is full), it is going to allocate 8k of memory, and both APIs will report 8k of allocated memory after that single object allocation.
Console.WriteLine(AppDomain.MonitoringTotalAllocatedMemorySize);
GC.KeepAlive(new object());
Console.WriteLine(AppDomain.MonitoringTotalAllocatedMemorySize);
might end up reporting:
x
x + 8000
How does BenchmarkDotNet deal with this problem? We perform a LOT of invocations (usually millions or billions), which minimizes the allocation quantum size problem (it's never 8k for us).
How to fix the problem in your case: set WithInvocationCount to a bigger number (maybe 1000).
To verify the results, you might consider using a memory profiler. Personally, I used the Visual Studio Memory Profiler, which is part of Visual Studio.
Another alternative is to use JetBrains.DotMemoryUnit. It's most probably the best tool in your case.

Hazelcast PagingPredicate performance

I'm trying to use PagingPredicate on a map with only 340 entries. For the first page with pageSize=15 it takes about 15 ms to retrieve the result, but for the last page it takes 250 ms. Is that a normal result?
Code example:
public List<NaturalPerson> getNaturalPersonByNameAndUser(String name, User user, int offset, int limit) {
    final PagingPredicate pagingPredicate =
            new PagingPredicate(new NaturalPersonPredicate(name, user), /*naturalPersonComparator,*/ limit);
    for (int i = 0; i < offset; i = i + limit) {
        pagingPredicate.nextPage();
    }
    return Lists.newArrayList(naturalPersonMap.values(pagingPredicate));
}
It totally depends. How many times did you run the test, by the way? Writing a microbenchmark is quite complicated. Personally I use JMH for it.
Can you try it with JMH and see if the numbers are still the same?
For an example with Hazelcast:
https://github.com/hazelcast/performancetop5/tree/master/item1
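For illustration, here is a minimal JMH sketch of how such a measurement could be structured. The Hazelcast wiring is deliberately left out (the map, user, and predicate come from the question and are only referenced in comments), so treat this as a skeleton rather than working benchmark code:
import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@Warmup(iterations = 5)
@Measurement(iterations = 10)
@Fork(1)
public class PagingPredicateBenchmark {

    @Setup
    public void setUp() {
        // connect to Hazelcast and populate naturalPersonMap
        // with the 340 test entries here
    }

    @Benchmark
    public Object firstPage() {
        // return getNaturalPersonByNameAndUser(name, user, 0, 15);
        return null; // placeholder
    }

    @Benchmark
    public Object lastPage() {
        // return getNaturalPersonByNameAndUser(name, user, 330, 15);
        return null; // placeholder
    }
}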

Can the JVM encounter "java.lang.OutOfMemoryError" despite having sufficient PermSize and OldGen space?

If an application has sufficient space for PermSize and OldGen, is it still possible to encounter OutOfMemoryErrors?
Besides Perm Gen and Old Gen, the JVM may also use non-heap memory (e.g. for direct memory buffers).
The amount of direct memory is limited by the -XX:MaxDirectMemorySize option. If it is exceeded, an OutOfMemoryError will be thrown.
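For example, direct buffers are allocated outside the Java heap. A minimal sketch (run with something like -XX:MaxDirectMemorySize=64m): the heap stays nearly empty, yet the JVM throws java.lang.OutOfMemoryError: Direct buffer memory once the direct-memory limit is exhausted.
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class DirectMemoryOom {
    public static void main(String[] args) {
        List<ByteBuffer> buffers = new ArrayList<>(); // keep the buffers reachable
        while (true) {
            // each allocation takes 16 MB of direct (off-heap) memory
            buffers.add(ByteBuffer.allocateDirect(16 * 1024 * 1024));
        }
    }
}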
Yes. Someone in your code base could throw it, or Sun ... er, Oracle ;) might throw it. For example, look at this code from ByteArrayOutputStream:
/**
 * Increases the capacity to ensure that it can hold at least the
 * number of elements specified by the minimum capacity argument.
 *
 * @param minCapacity the desired minimum capacity
 */
private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = buf.length;
    int newCapacity = oldCapacity << 1;
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity < 0) {
        if (minCapacity < 0) // overflow
            throw new OutOfMemoryError();
        newCapacity = Integer.MAX_VALUE;
    }
    buf = Arrays.copyOf(buf, newCapacity);
}
http://www.docjar.com/html/api/java/io/ByteArrayOutputStream.java.html

Progress Bar with countdown timer in android

I need a progress bar in my layout which will have a total time of 30 seconds and will tick every second. Basically, I want the user of my app to see that he has 30 seconds before time is up.
This is the piece of code I have written, but it gives me a blank progress bar with no activity. Please help.
What am I doing wrong?
public class MySeekBarActivity extends Activity {
    /** Called when the activity is first created. */
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);
        setProgressBarVisibility(true);
        final ProgressBar progressHorizontal = (ProgressBar) findViewById(R.id.progress_horizontal);
        progressHorizontal.setProgress(progressHorizontal.getProgress() * 100);
        new CountDownTimer(30000, 1000) {
            public void onTick(long millisUntilFinished) {
                progressHorizontal.incrementProgressBy(1);
                int dtotal = (int) (30000 - millisUntilFinished) / 30000 * 100;
                progressHorizontal.setProgress(dtotal);
            }

            public void onFinish() {
                // DO something when 2 minutes is up
            }
        }.start();
    }
}
You've got a type conversion bug, due to two things:
you're dividing by an int, which causes the decimal part to be rounded down;
you're also casting the result too early, so even if you divided by a float/double, the result would get rounded down anyway.
To see what I mean: you can safely remove the cast to int from your code, and it will compile anyway. That means your final number is an int, and since you're not making any casts earlier, you're losing the decimal info pretty early on in the code.
This is a possible fix:
int dtotal = (int) ((30000 - millisUntilFinished) / (double) 30000 * 100);
To track down such bugs in the future, make a dummy Java program with a loop containing the equation, and print out the intermediate results, for example:
public class NumberTester {
    // define the constants in your loop
    static final int TOTAL_TIME = 30000;
    static final int INTERVAL = 1000;

    public static void main(String[] args) {
        // perform the loop
        for (int millisUntilFinished = TOTAL_TIME; millisUntilFinished >= 0; millisUntilFinished -= INTERVAL) {
            int dtotal = (int) ((TOTAL_TIME - millisUntilFinished) / (double) TOTAL_TIME * 100);
            System.out.println(dtotal);
        }
    }
}
Also, some important things:
don't start your timer in onCreate - your activity is not visible yet at that point! Use onResume instead.
kill your timer in onPause (see the sketch after this list). Leaving timers and threads unmanaged like that is bad form, and may lead to weird bugs.
don't use "magic numbers". Place all your constant values in static final class members, like I've done in the example. This will save you a lot of headaches when you decide to change those values.
EDIT: As to why your progress bar stops short of completion - that's because the onTick method works a bit differently than you're probably assuming. To see what I mean, add:
System.out.println("Milis:" + millisUntilFinished);
System.out.println("dtotal:" + dtotal);
to your onTick method. The values clearly don't count down to 0 (and hence dtotal never reaches 100, it being derived from millisUntilFinished) - you have to compensate for that.
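One simple way to compensate, assuming the bar's maximum is the default 100 (an assumption, since the layout is not shown), is to force the bar to completion in onFinish():
public void onFinish() {
    // onTick never fires with millisUntilFinished == 0, so dtotal stops
    // short of 100 - force the bar to completion when the timer fires
    progressHorizontal.setProgress(100);
}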
