We have an old version of JBoss running multiple apps, and we get PermGen errors after multiple redeploys. I believe this is due to a classloader leak. It turns out that it is caused by a bug that they have decided not to fix:
https://issues.apache.org/bugzilla/show_bug.cgi?id=46221
The short and skinny of that link is that you get a classloader leak simply from using log4j, and they aren't fixing it.
So is there a way for me to fix the classloader leak so I don't need to restart the server every two weeks?
I'm hoping to avoid upgrading the server, but if I can change a configuration, apply some sort of patch, or perhaps reset the log file somehow, that would be great.
The bug has an attached patch. Did you try that? Going from JBoss 4 to 5 is not that painful; it would probably be easier to upgrade than to play around with a patch.
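One mitigation that is often suggested for log4j-related classloader leaks (it is not part of the linked bug report, and it may or may not cover the exact leak described there) is to shut log4j down explicitly when the webapp is undeployed, so its static state stops pinning the webapp classloader. A minimal sketch, assuming log4j 1.x and the standard servlet API:

import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;
import org.apache.log4j.LogManager;

// Registered as a <listener> in each application's web.xml.
public class Log4jShutdownListener implements ServletContextListener {

    public void contextInitialized(ServletContextEvent sce) {
        // nothing to do at startup
    }

    public void contextDestroyed(ServletContextEvent sce) {
        // Closes appenders and clears log4j's static logger hierarchy
        // so it releases references held against the webapp classloader.
        LogManager.shutdown();
    }
}

You would still need to verify with a heap dump that the webapp classloader is actually collected after undeploy.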
We recently added Hazelcast to one of our applications and noticed this NPE appearing in our logs without any obvious reason.
We are using Hazelcast 3.11 and there are twenty members in the cluster running on four physical servers.
We use Hazelcast to share some locks and a map across different JVMs.
[24/08/19 17:50:10:586 EST] 000000ba ExecutionServ E com.hazelcast.spi.ExecutionService [SERVERNAME]:5701 [xyz] [3.11.3] Failed to execute java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask#b20b531
java.lang.NullPointerException
at com.hazelcast.crdt.CRDTReplicationTask.replicate(CRDTReplicationTask.java:101)
at com.hazelcast.crdt.CRDTReplicationTask.run(CRDTReplicationTask.java:67)
at com.hazelcast.spi.impl.executionservice.impl.DelegateAndSkipOnConcurrentExecutionDecorator$DelegateDecorator.run(DelegateAndSkipOnConcurrentExecutionDecorator.java:77)
at com.hazelcast.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:227)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:906)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:929)
at java.lang.Thread.run(Thread.java:773)
at com.hazelcast.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:64)
at com.hazelcast.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:80)
Given that our application is very critical, I would like to understand what could potentially cause this and what the consequences would be. Our application seems to be working normally around the places where we use Hazelcast.
Thank you in advance for your inputs.
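For context, the usage pattern described in the question (a shared map plus distributed locks across JVMs) looks roughly like the sketch below in the Hazelcast 3.11 API; the map and lock names are invented for illustration, not taken from the application:

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.ILock;
import com.hazelcast.core.IMap;

public class SharedStateExample {
    public static void main(String[] args) {
        // Joins (or forms) the cluster based on the hazelcast.xml on the classpath.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // A map visible to every member of the cluster.
        IMap<String, String> shared = hz.getMap("shared-map");
        shared.put("some-key", "some-value");

        // A lock shared across JVMs (ILock in the 3.x API).
        ILock lock = hz.getLock("shared-lock");
        lock.lock();
        try {
            // cluster-wide critical section
        } finally {
            lock.unlock();
        }
    }
}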
This issue seems to have been logged with Hazelcast and fixed in September 2018:
https://github.com/hazelcast/hazelcast/pull/13706
But it looks like the fix never made it into one of the Hazelcast releases. See the release notes; there is no mention of PR 13706:
https://docs.hazelcast.org/docs/rn/index.html#3-12-2
I asked if/when this fix will be released (if not already) on the Hazelcast pull request (first link above).
One thing you could try, in case they did pull the fix into a release, would be to test with Hazelcast 3.12.2 (the latest release); maybe they pulled in the fix but didn't mention it in the release notes.
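If you do test with a newer jar, it can also be worth confirming at runtime which Hazelcast version each member is actually running, since with twenty members on four servers it is easy to end up with a mixed cluster. A small sketch (assuming the jar manifest carries the version, which Hazelcast release jars normally do, and the 3.8+ cluster API):

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class VersionCheck {
    public static void main(String[] args) {
        // The version recorded in the Hazelcast jar's manifest on this classpath.
        System.out.println("Jar version: " + Hazelcast.class.getPackage().getImplementationVersion());

        // The version the running cluster has agreed on.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        System.out.println("Cluster version: " + hz.getCluster().getClusterVersion());
        hz.shutdown();
    }
}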
Can you help me?
I tried JMeter testing against a Node.js server, but after some 5000 requests the Node server stops responding, so I have to restart it to make it work again. Is there any way to make it work again without restarting the server?
What you are asking is a way to treat the symptoms without treating the cause. I know of no way to "make it work again" but if we can find the cause of your problem we can fix it and remove the symptoms. It is difficult to comment on what exactly is happening without more information/code, but two things come to mind.
You may be performing some very heavy computations and accidentally blocking your event loop. This article discusses it in more detail.
You may have a memory leak which is crashing Node. This is easy to check by watching the memory usage of Node in the Windows Task Manager or a Mac/Linux equivalent. If the memory keeps increasing and never falls, Node may reach its max memory limit and crash. The only way I know of to work around this is to run the Node garbage collector manually. This article talks about it. This is of course a temporary solution; you should fix the memory leak if you find one.
Those are the only two I can think of. If you want more help, I'll need to see your code.
You can use 'forever': https://github.com/foreverjs/forever
But I think you already know why the script fails.
Good luck!
I recently upgraded from Jenkins 1.6 to 2.5. After I did this, I noticed very high CPU usage, sometimes over 300% (there are only 4 cores, so I don't think it could go over 400%). I'm not sure where to begin debugging this, but here's a thread dump and some screenshots from top/htop
As it turned out, my issue was that several jobs had thousands of old builds. This was fine in Jenkins 1.6, but it's a problem in 2.5 (I guess Jenkins tries to load all the builds into memory when you view the job overview page). To fix it, I just deleted most of the old builds from the problem jobs using this strategy and then reloaded Jenkins. Worked like a charm!
I also set the "discard old builds" plugin to keep only the 50 most recent builds, to prevent this from happening again.
Whenever a request comes in, Jenkins spawns some threads to serve it. After the upgrade, Jenkins might have been running under heavy load at that time. Please check the CPU and memory usage of the Jenkins server in the following scenarios:
Jenkins is idle and no other apps are running on the server.
A build is scheduled and no other apps are running on the server.
Then compare the behavior, which should help you determine whether Jenkins itself, or running Jenkins in parallel with other apps, is really causing the trouble.
As #vlp said, try to monitor the Jenkins application via JVisualVM, using jstatd to hook in. Refer to this link to configure JVisualVM with jstatd.
I have noticed a couple of reasons for abnormal CPU usage with my Jenkins install on Windows 7 Ultimate.
I had recently upgraded from v2.138 to v2.140 plus added a few additional plugins. I started noticing a problem with the Jenkins java executable taking up to 60% of my CPU time every time a job would trigger. None of the jobs were CPU bound, just grabbing data from external servers, so it didn't make any sense. It was fixed with a simple restart of the Jenkins service. I assume the upgrade just didn't finish cleanly.
Java garbage collection was throwing errors and hogging the CPU when running with the default memory settings. It was probably overkill, but I went wild and upped the Java heap space for Jenkins from the default 256 MB to 4 GB, which solved this problem for me. See this solution for instructions:
https://stackoverflow.com/a/8122566/4479786
2.5 seems to be a development release, while 1.6 is their Long Term Support version. Thus it seems logical that you should expect some regressions when using the bleeding edge version. The bounty on this question is proof that other users are experiencing this as well. The solution is to report a bug on the Jenkins bug tracker. You can temporarily downgrade to the known good version for now.
Try passing the following argument to Jenkins:
-Dhudson.util.AtomicFileWriter.DISABLE_FORCED_FLUSH=true
as mentioned here: https://issues.jenkins-ci.org/browse/JENKINS-52150
I want to upgrade from TomEE 1.5.1 to TomEE 1.6.0.
I have some Hazelcast maps that are populated during server startup.
When deployed on TomEE 1.5.1, this works fast (less than a second to populate and index 2k items, including some processing in between).
When deploying the exact same WARs to TomEE 1.6.0, the same task takes ~4 seconds.
To complete the picture, when running unit tests with openejb.home pointing to OpenEJB 4.6.0, it runs perfectly well.
Any ideas?
===== edit =====
I realized that this is a bit up in the air.
Here's a link to a simple WAR that puts 50000 items into the map:
https://drive.google.com/file/d/0B3Xw6Xt1YU4bVy16NE9Xc295LTA/edit?usp=sharing
I deployed it in apache-tomee-plus-1.5.1 and in apache-tomee-jaxrs-1.6.0. The times were ~2.5 sec and ~10 sec, respectively.
There is emphasized output in the TomEE log to indicate the time.
Sources are included.
I hope it helps in understanding and solving the issue.
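For anyone who can't download the WAR, what it does is essentially a timed bulk put into a Hazelcast map, roughly like the sketch below (the map name and values here are invented, not taken from the attached sources):

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import com.hazelcast.core.IMap;

public class PopulateMapTimer {
    public static void main(String[] args) {
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        IMap<Integer, String> map = hz.getMap("startup-data");

        // Put 50000 items and measure how long it takes, mirroring the
        // emphasized timing output mentioned above.
        long start = System.currentTimeMillis();
        for (int i = 0; i < 50000; i++) {
            map.put(i, "value-" + i);
        }
        long elapsed = System.currentTimeMillis() - start;

        System.out.println("Populated " + map.size() + " entries in " + elapsed + " ms");
        hz.shutdown();
    }
}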
Basically you are stuck in Hazelcast:
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.waitForResponse(BasicInvocation.java:721)
- locked <0x00000007c58b50c0> (a com.hazelcast.spi.impl.BasicInvocation$InvocationFuture)
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.get(BasicInvocation.java:695)
at com.hazelcast.spi.impl.BasicInvocation$InvocationFuture.get(BasicInvocation.java:674)
at com.hazelcast.map.proxy.MapProxySupport.invokeOperation(MapProxySupport.java:239)
at com.hazelcast.map.proxy.MapProxySupport.putInternal(MapProxySupport.java:200)
at com.hazelcast.map.proxy.MapProxyImpl.put(MapProxyImpl.java:71)
at com.hazelcast.map.proxy.MapProxyImpl.put(MapProxyImpl.java:57)
You can take some thread stacks in both instances to compare, but TomEE didn't change enough to justify such a difference on its own.
Do you use the exact same network config?
I didn't get any exception either. I just took a thread dump at startup to see whether TomEE was the bottleneck or not. Since the time is spent in Hazelcast, TomEE shouldn't be the cause of it, so you need to compare both instances.
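If jstack isn't handy on a given box, an in-process dump good enough for that comparison can be produced with the standard java.lang.management API, for example:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumper {
    // Prints every live thread with its full stack and lock info so dumps
    // from the 1.5.1 and 1.6.0 instances can be diffed side by side.
    public static void dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        for (ThreadInfo info : threads.dumpAllThreads(true, true)) {
            System.out.println("\"" + info.getThreadName() + "\" " + info.getThreadState());
            for (StackTraceElement frame : info.getStackTrace()) {
                System.out.println("    at " + frame);
            }
            System.out.println();
        }
    }

    public static void main(String[] args) {
        dump();
    }
}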
The problem was solved by changing the objects that Hazelcast serializes.
For TomEE 1.5.1, those objects were implementing the java.io.Serializable interface, and the performance difference occurred.
I changed them to implement com.hazelcast.nio.serialization.DataSerializable, and now things run faster and consistently on both servers.
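As an illustration of that change, a value class moved from java.io.Serializable to Hazelcast's DataSerializable looks roughly like this (the class and fields are invented for the example):

import java.io.IOException;

import com.hazelcast.nio.ObjectDataInput;
import com.hazelcast.nio.ObjectDataOutput;
import com.hazelcast.nio.serialization.DataSerializable;

public class Item implements DataSerializable {
    private long id;
    private String name;

    public Item() {
        // Hazelcast needs a public no-arg constructor to deserialize.
    }

    public Item(long id, String name) {
        this.id = id;
        this.name = name;
    }

    // Explicit, reflection-free serialization instead of java.io.Serializable.
    public void writeData(ObjectDataOutput out) throws IOException {
        out.writeLong(id);
        out.writeUTF(name);
    }

    public void readData(ObjectDataInput in) throws IOException {
        id = in.readLong();
        name = in.readUTF();
    }
}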
So, although my problem was solved, I still don't understand the behavior differences.
It can be a classloading difference; TomEE 1.6 tolerates a few more classes from the webapp by default. Playing with openejb.classloader.forced-skip=package1,package2,... to exclude classes common to the webapp and tomee/lib can make it faster.
We migrated an application from WebSphere v6 to v8 and started getting memory leaks. The primary suspect is org.apache.axis2. It looks like each time the application calls a web service, an object called ServiceClient is created by WAS 8 and stored in something called ClientConfigurationContextStore, and then never garbage collected. Has anybody had a similar issue?
I fixed the problem by forcing the original Axis 1.4 over the supplied SOAP implementation. This was done by placing two files in the application's WEB-INF/services. The first file is called javax.xml.soap.MessageFactory and contains 'org.apache.axis.soap.MessageFactoryImpl'; the second is called javax.xml.soap.SOAPConnectionFactory and contains 'org.apache.axis.soap.SOAPConnectionFactoryImpl'. So now javax.xml.soap.SOAPConnectionFactory.newInstance() returns the org.apache.axis implementation, whereas before it was returning the com.ibm.ws.webservices one. No memory leaks anymore.
If you don't have the problem in WebSphere v6, it's possible it is a leak in v8 itself. But it's also possible that v8 is being more strict about something that v6 was letting you get away with.
Have you checked that you're reusing all the Axis2 client objects that you can, rather than recreating ones on every call that you don't need to recreate? I recall us having some leakage in Axis2 client code under WAS v6.1 and realizing that we were recreating objects that we could be reusing instead.
In one of our projects, we used Axis2 1.6.2 as the service client. The application server was WebSphere 7, and in the test environment it ran out of memory from time to time. When I examined a heap dump, the AxisConfiguration class held lots of AxisService instances. I was instantiating a ServiceClient for every request, and I saw that garbage collection sometimes finalized these objects too late. So we made the ServiceClient a singleton, and that solved our problem.
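A minimal version of that reuse pattern, creating the Axis2 ServiceClient once and handing the same instance to every caller instead of building a new one per request, could look like the sketch below; whether one instance can safely be shared by concurrent callers depends on how it is configured and used, so treat this as mirroring the answer above rather than a general recommendation:

import org.apache.axis2.AxisFault;
import org.apache.axis2.client.ServiceClient;

public final class SharedServiceClient {
    private static ServiceClient client;

    private SharedServiceClient() {
    }

    // Lazily creates a single ServiceClient and reuses it, instead of
    // instantiating (and leaking) a new one for every web service call.
    public static synchronized ServiceClient get() throws AxisFault {
        if (client == null) {
            client = new ServiceClient();
        }
        return client;
    }
}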