Garbage collection log for groovy? - groovy

I've inherited a Groovy application. The Groovy part is rather small, maybe 500 LOC, and is mostly used to prime and start Java threads.
Now the sysadmins come to me with tales of woe, with tales of the dreaded OOME (OutOfMemoryError).
With a Java app I would take a look at the GC log, but here there is none.
How do I get a GC log for Groovy? Is it possible?
I've googled around for quite a bit, to no avail.
Anybody with any ideas?

Groovy runs on the JVM, so it accepts the same JVM options as a Java application would, including the GC logging options you would use for a Java app. You can pass them via the JAVA_OPTS environment variable.
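To confirm that options set in JAVA_OPTS actually reached the JVM, you can list the JVM's input arguments from inside the process. The following is a minimal sketch, not part of the original answer: the class name GcFlagCheck and the helper isGcLoggingFlag are made up for illustration, and the flag prefixes it looks for (-verbose:gc, -Xloggc: and -XX:+PrintGC on JDK 8 and earlier, -Xlog:gc on JDK 9 and later) are the common GC logging options, not an exhaustive list. The same calls can be made from the Groovy start-up script, since Groovy can use Java classes directly.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class GcFlagCheck {
    // Hypothetical helper: true if the argument looks like a common GC logging option.
    static boolean isGcLoggingFlag(String arg) {
        return arg.equals("-verbose:gc")
                || arg.startsWith("-Xloggc:")      // JDK 8 and earlier
                || arg.startsWith("-XX:+PrintGC")  // JDK 8 and earlier
                || arg.startsWith("-Xlog:gc");     // JDK 9 and later
    }

    // Counts the GC logging options in a list of JVM arguments.
    static long countGcLoggingFlags(List<String> jvmArgs) {
        return jvmArgs.stream().filter(GcFlagCheck::isGcLoggingFlag).count();
    }

    public static void main(String[] args) {
        // The arguments the JVM was started with (not the program arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("GC logging options found: " + countGcLoggingFlags(jvmArgs));
    }
}
```

If the count is zero, the options did not reach the JVM, which usually means JAVA_OPTS was not exported in the environment that starts the application.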

Related

System.gc() calls in Wildfly

I am currently using WildFly 10.1 in production and just discovered that we have a lot of GC pauses. Analysis of the GC log showed that 95% of the GC runs are triggered by System.gc() calls. Our application code does not invoke any of them.
Is this a WildFly feature?
Or can someone point me in the right direction to figure out if these System.gc() invocations make sense?
Of course, I am aware that there are a number of measures to optimize GC behavior. I am just asking myself why there are so many System.gc() calls.
System.gc() is called by Java RMI, or more specifically by the sun.misc.GC class.
The default interval is 1 hour. You can set it by using these parameters (values are in milliseconds):
-Dsun.rmi.dgc.client.gcInterval=3600000
-Dsun.rmi.dgc.server.gcInterval=3600000
Setting -XX:+DisableExplicitGC can make your application slower and slower over time.
See also: What is the default Full GC interval in Java 8
If you want to find callers for System.gc the most reliable method is to attach a debugger and set a method entry breakpoint on it.
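As a quick check of which interval a running JVM was configured with, the two system properties named above can be read back. This is only a sketch: the class name RmiGcInterval is made up, and the fallback value simply assumes the one-hour default stated in the answer, so it reports what was set on the command line rather than what RMI is using internally.

```java
class RmiGcInterval {
    // One hour in milliseconds, the default stated in the answer above (an assumption here).
    static final long DEFAULT_INTERVAL_MS = 3_600_000L;

    // Returns the configured interval, or the assumed default if the property is unset.
    static long intervalMillis(String property) {
        return Long.getLong(property, DEFAULT_INTERVAL_MS);
    }

    public static void main(String[] args) {
        System.out.println("client: " + intervalMillis("sun.rmi.dgc.client.gcInterval") + " ms");
        System.out.println("server: " + intervalMillis("sun.rmi.dgc.server.gcInterval") + " ms");
    }
}
```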

Coding with Revit API: tips to reduce memory use?

I have a rather 'general' question. I am developing with the Revit API (in Python), and I sometimes observe that the Revit session gets slower during my tests and trials (the longer Revit stays open, the more it seems to happen). It's not getting to the point where it would be really problematic, but it made me think about it anyway.
Since I have no programming background, I am pretty sure that my code is filled with really 'unorthodox' things that could be done far better.
Are there some basic 'tips and tricks' that I could follow (I mean, related to the Revit API) to help the speed of code execution? Or maybe I should say: to help reduce memory use?
For instance, I've read about the 'Dispose' method, notably for use with Transactions (for instance here: http://thebuildingcoder.typepad.com/blog/2012/09/disposal-of-revit-api-objects.html ), but in the end it's not clear to me whether that is actually important to do. Furthermore, since I'm using Python, I don't know where that puts me in the discussion about whether to use "using" or not.
Should I just 'Dispose' everything? ;)
Besides the 'Dispose' method, is there something else?
Thanks a lot,
Arnaud.
Basics:
Okay let's talk about a few important points here:
You're running scripts under IronPython, which is an implementation of Python written in C#.
C# runs on the .NET runtime, which uses a garbage collector to reclaim unused memory.
The garbage collector (GC) is a part of the runtime that runs periodically to collect unused objects. It uses a series of techniques to group and categorize the target memory areas for later collection.
Your main program is paused by the runtime to allow the GC to collect memory. This means that the more time the GC needs for each run, the slower your program gets, and you'll experience some lag.
Issue:
Now to the heart of this issue:
Python is an object-oriented programming language at heart, and IronPython creates objects (similar in concept to Elements in Revit) for everything, from your variables to the methods of a class to functions and everything else. This means that all these objects need to be collected when they're no longer used.
When Python is used as a scripting language for a program, there is generally one single Python engine that executes all user input.
However, Revit does not have a command prompt with an associated Python engine. So every time you run a script in Revit, a new engine is created that executes the program and dies at the end.
This dramatically increases the amount of unused memory for the GC to collect.
Solution:
I'm the creator and maintainer of pyRevit, and this issue was resolved in pyRevit v4.2.
The solution was to set LightweightScopes = true when creating the IronPython engine, which forces the engine to create smaller objects. This dramatically decreased the memory used by IronPython and increased the amount of time before the user experiences Revit performance degradation.
Sorry, I can't comment because of my low reputation. I use another way to reduce memory. It's less pretty than the LightweightScopes trick, but it works for a one-time cleanup after expensive operations:
import gc
my_object = list(range(1000000))  # stand-in for some huge object
# [operation]
del my_object  # for a list or dict, my_object = [] also drops the reference
gc.collect()  # ask the garbage collector to run now

How to prevent GC when using Caliper

When using Caliper, I get the error
ERROR: GC occurred during timing.
as some garbage gets produced in my benchmark, which I can't avoid. I guess giving more memory to the target JVM could help, as there's not that much garbage. I'm aware of the -D and -J options, but somehow they don't work for me.
Firstly, I see in this question that multiple arguments passed via Jmemory=-Xmx512M,-Xmx16M get used separately, i.e., each comma-separated argument leads to a new run. But I'd like to pass multiple arguments to be used together, like maybe -Xmx16G -XX:NewSize=12G, so that the GC gets postponed as much as possible (and actually doesn't happen at all, as the run finishes in the meantime). How can I do it?
Secondly, what are the best arguments for postponing the GC as much as possible? I mean, give the JVM a lot of memory (-Xmx), use it all for Eden, and don't care about how full it gets.
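To check whether a given set of JVM arguments really keeps the collector quiet during a measured section, the collection counters exposed through the standard management API can be compared before and after the work. This is a minimal sketch under stated assumptions: it is not how Caliper itself detects GC, and the class name GcDuringTiming and its methods are invented for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcDuringTiming {
    // Sums the collection counts of all collectors in this JVM.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = bean.getCollectionCount(); // -1 if the collector does not report a count
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Runs the work and returns how many collections happened while it ran.
    static long gcRunsDuring(Runnable work) {
        long before = totalGcCount();
        work.run();
        return totalGcCount() - before;
    }

    public static void main(String[] args) {
        long runs = gcRunsDuring(() -> {
            // Stand-in for the benchmark body: allocate some short-lived garbage.
            for (int i = 0; i < 100_000; i++) {
                byte[] garbage = new byte[1024];
                garbage[0] = (byte) i;
            }
        });
        System.out.println("GC runs during the measured section: " + runs);
    }
}
```

Running this with different heap settings shows whether a larger young generation actually brings the count down to zero for a workload of that size.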

Find a String version of a GroovyConsole Script in a Heap Dump

I accidentally ran a script with an infinite loop in GroovyConsole. :-\
As Murphy's Law would have it, I hadn't saved my work for 3 or 4 hours. So, before killing the GroovyConsole process, I dumped the heap, hoping to find a String version of the script that was running at that moment.
Do you have a hint about which class it might be hiding in, or whether this is possible at all?
So, it turns out that my guess was right. The groovy.ui.Console object keeps a history of changes to the script. Here is the OQL query that gave me my script back, to my great pleasure. I ran it in VisualVM with the OQL plugin, but I could have used jhat:
select x.history.elementData[x.history.elementData.length-2].allText.toString() from groovy.ui.Console x
Desperate Groovy developers who have lost their code might be relieved now :-) I certainly am.
The string version of the script may exist in another object as well. I'd love to hear other solutions.

Using TDD to drive out thread-safe code

What's a good way to leverage TDD to drive out thread-safe code? For example, say I have a factory method that utilizes lazy initialization to create only one instance of a class, and return it thereafter:
private TextLineEncoder textLineEncoder;
...
public ProtocolEncoder getEncoder() throws Exception {
    if (textLineEncoder == null)
        textLineEncoder = new TextLineEncoder();
    return textLineEncoder;
}
Now, I want to write a test in good TDD fashion that forces me to make this code thread-safe. Specifically, when two threads call this method at the same time, I don't want to create two instances and discard one. This is easily done, but how can I write a test that makes me do it?
I'm asking this in Java, but the answer should be more broadly applicable.
You could inject a "provider" (a really simple factory) that is responsible for just this line:
textLineEncoder = new TextLineEncoder();
Then your test would inject a really slow implementation of the provider. That way the two threads in the test could more easily collide. You could go as far as have the first thread wait on a Semaphore that would be released by the second thread. Then success of the test would ensure that the waiting thread times out. By giving the first thread a head-start you can make sure that it's waiting before the second one releases.
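The slow-provider idea can be sketched roughly as follows. This is an illustration under assumptions, not the answerer's code: EncoderFactory, slowProvider and creationsUnderContention are made-up names, a plain Object stands in for TextLineEncoder, and a CountDownLatch is used instead of a Semaphore to release the threads together. The provider counts how often it is called, so the check is simply that exactly one instance was created.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class LazyFactoryRaceDemo {
    // Lazy factory; the Supplier plays the role of the injected "provider".
    static class EncoderFactory {
        private final Supplier<Object> provider;
        private Object encoder;

        EncoderFactory(Supplier<Object> provider) {
            this.provider = provider;
        }

        // "synchronized" is what the test is meant to force. Without it,
        // the slow provider makes a second creation very likely.
        synchronized Object getEncoder() {
            if (encoder == null) {
                encoder = provider.get();
            }
            return encoder;
        }
    }

    // Starts the given number of threads together and returns how many instances were created.
    static int creationsUnderContention(int threads) {
        AtomicInteger created = new AtomicInteger();
        Supplier<Object> slowProvider = () -> {
            created.incrementAndGet();
            try {
                Thread.sleep(100); // widen the race window
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return new Object();
        };
        EncoderFactory factory = new EncoderFactory(slowProvider);
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    start.await(); // wait until all threads are ready
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                factory.getEncoder();
            });
            workers[i].start();
        }
        start.countDown();
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return created.get();
    }

    public static void main(String[] args) {
        System.out.println("instances created: " + creationsUnderContention(2));
    }
}
```

In a TDD cycle the assertion "exactly one instance created" would be written first against the unsynchronized method, where the 100 ms delay makes it fail almost every time, and adding synchronization is then the change that makes it pass.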
It's hard, though possible - possibly harder than it's worth. Known solutions involve instrumenting the code under test. The discussion here, "Extreme Programming Challenge Fourteen" is worth sifting through.
In the book Clean Code there are some tips on how to test concurrent code. One tip that has helped me to find concurrency bugs, is running concurrently more tests than the CPU has cores.
In my project, running the tests takes about 2 seconds on my quad-core machine. When I want to test the concurrent parts (there are some tests for that), I hold down the IntelliJ IDEA hotkey for running all tests until I see in the status bar that 20, 50 or 100 test runs are executing. I watch the CPU and memory usage in Windows Task Manager to find out when all the test runs have finished (memory usage goes up by 1-2 GB while they are all running and then slowly goes back down).
Then I close one by one all the test run output dialogs, and check that there were no failures. Sometimes there are failed tests or tests which are in deadlock, and then I investigate them until I find the bug and have fixed it. That has helped me to find a couple of nasty concurrency bugs. The most important thing, when facing an exception/deadlock that should not have happened, is always assuming that the code is broken, and investigating the reason ruthlessly and fixing it. There are no cosmic rays which cause programs to crash randomly - bugs in code cause programs to crash.
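The same oversubscription effect can be approximated in code instead of with a held-down hotkey. The sketch below is an assumption-laden illustration, not the procedure described above: the class name OversubscribedRunner, the factor of four threads per core, and the convention that a check returns true on success are all choices made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class OversubscribedRunner {
    // Runs the check many times on more threads than there are cores
    // and returns how many runs failed (returned false or threw).
    static int countFailures(Callable<Boolean> check, int runs) {
        int threads = Runtime.getRuntime().availableProcessors() * 4;
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < runs; i++) {
                results.add(pool.submit(check));
            }
            int failures = 0;
            for (Future<Boolean> result : results) {
                try {
                    if (!result.get()) {
                        failures++;
                    }
                } catch (Exception e) {
                    failures++; // an exception in the check counts as a failure
                }
            }
            return failures;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-in check that always passes; a real one would exercise the code under test.
        int failures = countFailures(() -> true, 100);
        System.out.println("failed runs: " + failures);
    }
}
```

A check that can deadlock would need a timeout on result.get(), which this sketch leaves out for brevity.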
There are also frameworks such as http://www.alphaworks.ibm.com/tech/contest which use bytecode manipulation to force the code to do more thread switching, thus increasing the probability of making concurrency bugs visible.
When I recently test-drove an implementation that needed to be thread-safe, I came up with the solution I provided as an answer to this question. I hope that helps, even though there are no tests there. I hope a link is OK rather than duplicating the answer...
Chapter 12 of Java Concurrency in Practice is called "Testing Concurrent Programs". It documents testing for safety and liveness, but says this is a hard subject. I am not sure this problem is solvable by the tools of that chapter.
Just off the top of my head: could you compare the instances returned to see whether they are indeed the same instance or different ones? That's probably where I would start in C#, and I would imagine you can do the same in Java.
