javaw.exe consumes memory on starting STS - memory-leaks

At first I thought my program had memory leaks. But I terminated all java processes and restarted Spring Tools Suite. I kept an eye on the task manager. In just a few minutes, javaw.exe had grown to 2,000,000 K Memory. The memory keeps going up, without issuing commands in STS. STS has literally ONLY been opened. I have no tabs open in it. The error log doesn't show any memory related errors. Upon closing STS javaw.exe DOES disappear from task manager and opening STS restarts the process over again around 150,000K, quickly jumping to 600,000K, then slowly growing and growing until it has consumed all my memory.
Any thoughts what might be causing this? I'm running a full system scan now just in case I've been compromised.
--edit--
This problem started around 10 AM Eastern and mysteriously went away at noon, when the security scan completed. No items were detected by the scan to lend an explanation to either the problem or its mysterious resolution. As of now javaw.exe is hovering at or around 700,000K. Very strange!

Sounds like a 2 hour bug! Be thankful it is gone but be sure to document it thoroughly if it occurs again. Sounds like a rough 2 hours you went through.
That is not completely unusual unfortunately. Because Eclipse is made up of a bunch of plug-ins some times a plug-in can go wild and start consuming memory and/or CPU. Using VisualVM (http://visualvm.java.net/) you can determine what is causing Eclipse to freak out. Depending on what it is, you might be able to disable that functionality. Because it could be so many different plug-ins it doesn’t surprise me you could not find any answers googling or looking here at StackOverflow.

Related

Sysmgr volatile memory bug

As you may know that there is a bug with Nexus switches called SYSMGR-2-VOLATILE_DB_FULL with versions below System version: 5.0(2)N2(1) that causes a switch to crash reboot once dir /dev/shm gets to 100% unless updated to a later version.
in order to fill the dir you can run long commands such as "show run" (needs to be over 190 lines) and then check how it increases by running
show system internal flash
show system internal dir /dev/shm | i csm_acfg | count
I was wondering if there is a similar bug with 4500 switches ?
Catalyst 4500 L3 Switch Software (cat4500es8-UNIVERSALK9-M), Version 03.11.00.E RELEASE SOFTWARE (fc3)
So what exactly happened...
I have a script that runs from time to time that gets over 190 lines from all of our switches and performs some action remotely, so recently when the script ran a few minutes later we had a massive outage since our core switch had a power outage( at least what I was able to see from the logs) The thing is there are 2 4500 chassis configured with sso redundancy so the failover should have been instantaneous, however everything was down for about 8 mins before the standby switch became active.
Can anyone please advise if there is a similar bug with 4500 switches ?
Thank you.
After analysing crash info I was able to find some things that caused the crash, however wont be able to tell with 100 % certainty what exactly happened to crash it
So there are a few errors that are called VFETQINTERRUPT and VFETQTOOMANYPARITYERRORS basically VFETQINTERRUPT counts fast accruing errors and VFETQTOOMANYPARITYERRORS will crash reboot switch if exceeds 100 errors in a short period of time, could indicate that there is a hardware error
and this is pretty much what happened in out environment, something has caused 100+ errors and it crashed rebooted.
There is a command to stop it from crash rebooting, however not sure if it should be used as if there is a hardware issue it better to failover onto the other supervisor.
platform fw-asic dbl hash memory parity-error reload never

about managing file system space

Space Issues in a filesystem on Linux
Lets call it FILESYSTEM1
Normally, space in FILESYSTEM1 is only about 40-50% used
and clients run some reports or run some queries and these reports produce massive files about 4-5GB in size and this instantly fills up FILESYSTEM1.
We have some cleanup scripts in place but they never catch this because it happens in a matter of minutes and the cleanup scripts usually clean data that is more than 5-7 days old.
Another set of scripts are also in place and these report when free space in a filesystem is less than a certain threshold
we thought of possible solutions to detect and act on this proactively.
Increase the FILESYSTEM1 file system to double its size.
set the threshold in the Alert Scripts for this filesystem to alert when 50% full.
This will hopefully give us enough time to catch this and act before the client reports issues due to FILESYSTEM1 being full.
Even though this solution works, does not seem to be the best way to deal with the situation.
Any suggestions / comments / solutions are welcome.
thanks
It sounds like what you've found is that simple threshold-based monitoring doesn't work well for the usage patterns you're dealing with. I'd suggest something that pairs high-frequency sampling (say, once a minute) with a monitoring tool that can do some kind of regression on your data to predict when space will run out.
In addition to knowing when you've already run out of space, you also need to know whether you're about to run out of space. Several tools can do this, or you can write your own. One existing tool is Zabbix, which has predictive trigger functions that can be used to alert when file system usage seems likely to cross a threshold within a certain period of time. This may be useful in reacting to rapid changes that, left unchecked, would fill the file system.

Netlogo 5.1 (and 5.05) Behavior Space Memory Leak

I have posted on this before, but thought I had tracked it down to the NW extension, however, memory leakage still occurs in the latest version. I found this thread, which discusses a similar issues, but attributes it to Behavior Space:
http://netlogo-users.18673.x6.nabble.com/Behaviorspace-Memory-Leak-td5003468.html
I have found the same symptoms. My model starts out at around 650mb, but over each run the private working set memory rises, to the point where it hits the 1024 limit. I have sufficient memory to raise this, but in reality it will only delay the onset. I am using the table output, as based on previous discussions this helps, and it does, but it only slows the rate of increase. However, eventually the memory usage rises to a point where the PC starts to struggle. I am clearing all data between runs so there should be no hangover. I noticed in the highlighted thread that they were going to run headless. I will try this, but I wondered if anyone else had noticed the issue? My other option is to break the BehSpc simulation into a few batches so the issues never arises, bit i would be nice to let the model run and walk away as it takes around 2 hours to go through.
Some possible next steps:
1) Isolate the exact conditions under which the problem does or not occur. Can you make it happen without involving the nw extension, or not? Does it still happen if you remove some of the code from your model? What if you keep removing code — when does the problem go away? What is the smallest code that still causes the problem? Almost any bug can be demonstrated with only a small amount of code — and finding that smallest demonstration is exactly what is needed in order to track down the cause and fix it.
2) Use standard memory profiling tools for the JVM to see what kind of objects are using the memory. This might provide some clues to possible causes.
In general, we are not receiving other bug reports from users along these lines. It's routine, and has been for many years now, for people to use BehaviorSpace (both headless and not) and do experiments that last for hours or even for days. So whatever it is you're experiencing almost certainly has a more specific cause -- mostly likely, in the nw extension -- that could be isolated.

Memory leak with apache, tomcat & mod_jk & mysql

I'm running tomcat 7 with apache 2.2 & mod_jk 1.2.26 on a debian-lenny x64 server with 2GB of RAM.
I've a strange problem with my server: every several hour & sometimes (under load) every several minutes, my tomcat ajp-connector pauses with a memory leak error, but seems this error also effects some other parts of system (e.g some other running applications also stop working) & I have to reboot the server to solve the problem for a while.
I've checked catalina.out for several days, but it seem's there is not a unique error pattern just before pausing ajp with this message:
INFO: Pausing ProtocolHandler ["ajp-bio-8009"]
Sometimes there is this message before pausing:
Exception in thread "ajp-bio-8009-Acceptor-0" java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:597)...
& sometimes this one:
INFO: Reloading Context with name [] has started
Exception in thread "ContainerBackgroundProcessor[StandardEngine[Catalina]]" java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:597)
at org.apache.catalina.core.StandardContext.stopInternal(StandardContext.java:5482)
at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:230)
at org.apache.catalina.core.StandardContext.reload(StandardContext.java:3847)
at org.apache.catalina.loader.WebappLoader.backgroundProcess(WebappLoader.java:424)
at org.apache.catalina.core.ContainerBase.backgroundProcess(ContainerBase.java:1214)
at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1400)
at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1410)
at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1410)
at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1389)
at java.lang.Thread.run(Thread.java:619)
java.sql.SQLException: null, message from server: "Can't create a new thread (errno 11); if you are not out of available memory, you can consult the manual for a possible OS-dependent bug"...
& some other times the output messages related to some other parts of program.
I've checked my application source code & I don't guess it causes the problem, I've also checked memory usage using jConsole. The wanderfull point is that when server fails, is shows a lot of free memory on both heap & non-heap jvm memory space. As I told before, after crashing server, many other applications also fail & when I want to restart them it gives a resource temporary unavailable message (I've also checked my limits.conf file).
So I really really confused with this serious problem many days & i have really no more idea about it. So, can anybody please give me any kind of suggestion to solve this complicated & unknown problem ???
What could be the most possible reason for this error ?
What are your limits for number of processes?
Check them with uname -a and check maximum number of processes. If it's 1024, increase it.
Also, check the same thing for user which you are using to start it (for example, if you are using nobody user for your stuff, run su -c "ulimit -a" -s /bin/sh nobody to see what actually this user sees as limits). That should show you a problem (had it couple of days ago, totally missed to check this).
In the moment when that starts happening, you can also count all your running threads and processes for that user (or even better to monitor it using rrdtool or something else) with "ps -eLf | wc -l" which will give you back simple count of all processes and threads running on your system. This information, together with limits for all particular users, should solve your issue.
Use jvisualvm to check the heap usage of your jvm. If you see it slowly climbing over a period of time, that is a memory leak. Sometimes a memory leak is short term and eventually gets cleared up, only to start again.
If you see a sawtooth pattern, take a heap dump near the peak of the sawtooth, otherwise take a heapdump after the jvm has been running long enough to be at a high risk of and OOM error. Then copy that .hprof file to another machine and use the Eclipse MAT (Memory Analysis Tool) to open it up and identify likely culprits. You will still need to spend some time following refs in the data structure and also reading some Javadocs to figure out just what is using that Hashmap or List that is growing out of control. The sorting options are also useful to focus on the most likely problem areas.
There are no easy answers.
Note that there is also a command line tool included with the SUN jvm which can trigger a heapdump. And if you have a good profiler that can also be of use because memory leaks are usually in a piece of code that is executed frequently and therefore will show up as a hot spot in the profiler.
I finally found the problem: it was not actually a memory leak, but the limitation in number of allowed threads for the VPS was caused the problem. My server was a Xen vps with default limitation of 256 threads, so when it reached the maximum allowed threads, the supervisor was killed some of running threads (that was cause of stopping some of my running processes). By increasing number of allowed threads to 512, the problem totally solved (of course if I increase maxThreads in tomcat settings, its obvious that the problem will rise again).

How to fix GC error in Mac Common Lisp 5.0?

I'm fairly new to Lisp, and I'm trying to run an algorithmic music application on the original MCL 5.0 (not the RMCL version). The program works by incrementally inputting textual representations of music, and learning from the user via an association net. Unfortunately, shortly after I begin inputting the text, I begin to see the GC icon flash. The more text I input, the longer the GC will appear, until finally it will last so long that the application will crash. I've been talking with the creator of this application, and he's never had this problem. Any ideas as to how I might fix this? Perhaps somehow altering my MCL's GC preferences?
On a side note, when I input the text and the GC icon is flashing, in Activity Monitor it shows MCL using around 90% of my CPU's processing power, but not much RAM.
MCL on what OS and Mac?
It could be that MCL starts up with too little memory. Possible reasons: it is configured for too little memory, the Mac has too little free memory for some reason.
(room t)
shows details about the available memory.
It can also be that the program takes up too much memory when running. Reasons for that: it is not compiled or the available memory is too small.
Generally I would propose to use the MCL user mailing list for these questions.
Send a message with the text 'help' in the body to info-mcl-request # digitool.com (remove the spaces). You will get a message how to subscribe. The actual mailing list is info-mcl # digitool.com (again without the spaces).

Resources