I recently ran this on my system:
ulimit -c unlimited
As designed, it created a core file for the program I use. Ever since, I have had random crashes in my program, but I haven't had a chance to check the core dumps to see what errors they contain; since the program does a daily restart, I assume the previous dumps are gone. If they are not, please tell me so I can look them up.
But my question is: could this new ulimit command in any way be the cause of the server crashes? For years I ran the same program with no crashes, and since this command I have had random crashes from time to time; it somewhat feels like the program loops for around five minutes and then restarts.
Any help is appreciated, as I cannot reproduce the issue.
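For reference, a hedged sketch of how to check where core files end up and whether any older dumps are still on disk (the /var/crash location is an assumption; apport on Ubuntu, for example, stores crash reports there):

# current core file size limit in this shell
ulimit -c

# where the kernel writes cores; a plain file name means the process's
# working directory, a leading "|" means a helper such as apport or
# systemd-coredump intercepts them
cat /proc/sys/kernel/core_pattern

# look for dumps that may still be around (locations are guesses)
ls -lt core* /var/crash 2>/dev/null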
I have a script that uses a lot of memory, and it just gets "killed" or cannot be allocated more memory. When I run the same script on Windows, it consumes all my memory and freezes my PC. I want the same to happen on my Linux server instead of the process being killed.
I have tried changing vm.overcommit_memory to 0, 1 and 2, but none of them work. I also tried some other things, like disabling the OOM killer in Linux, but I can't find the value vm.oom-killer. Please help.
Update:
Another solution would be to limit the memory it uses, for example to 10 GB, but if it exceeds that, don't let it consume more and don't kill the process; just let it finish. A sketch of how that might look is below.
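A hedged sketch of two ways to cap the script at roughly 10 GB (my_script is a placeholder name; the second variant assumes a systemd-based system with cgroup v2 memory control):

# Option 1: limit the virtual address space for this invocation only;
# allocations beyond the limit simply fail inside the process instead
# of waking the OOM killer (value is in kilobytes)
( ulimit -v 10485760; ./my_script )

# Option 2: run it in a transient cgroup that is throttled and reclaimed
# above 10 GB rather than killed
systemd-run --scope -p MemoryHigh=10G ./my_script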
I'm having a really strange problem when running a PHP script as a daemon. First, I want to say that I've been using this kind of script for several years now without any problem.
During the past weekend, I rebuilt one of our EC2 servers in AWS, and I started to see some memory leaking from my daemon scripts.
I started monitoring one of them by adding a log line on each cycle of the script:
System_Daemon::info("After a cycle peak : ".number_format((memory_get_peak_usage()/1024/1024), 2)."MB and real : ".number_format((memory_get_usage()/1024/1024), 2)."MB");
Both numbers in this log show exactly the same values on every cycle:
[Nov 21 10:24:14] info: After a cycle peak : 5.31MB and real : 4.87MB
but when I look at the process on my system, it is leaking memory: all the memory numbers (VIRT, RES, SHR and %MEM) keep going up until the process is shut down by the system.
I really don't know where to start looking to fix this. The only difference I've seen before/after my server rebuild is that the PHP version has slightly changed from PHP 7.0.33-0ubuntu0.16.04.2 to PHP 7.0.33-0ubuntu0.16.04.7.
Can anyone help me understand what is going on?
Thanks.
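One possibility worth noting (an assumption, not a confirmed diagnosis): memory_get_usage() and memory_get_peak_usage() only report memory managed by PHP's own allocator, so memory leaked by an extension, a native library, or anything else outside that allocator will not show up there even while the process RSS keeps climbing. A hedged sketch for logging the process's real memory from the shell (my_daemon.php is a placeholder for the daemon's command line):

# append the daemon's RSS and VSZ (in KB) to a log every 60 seconds
while true; do
    pid=$(pgrep -f my_daemon.php | head -n1)
    echo "$(date '+%F %T') $(ps -o rss=,vsz= -p "$pid")"
    sleep 60
done >> /tmp/daemon-memory.log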
TLDR: Can't find the core dump even after setting ulimit and looking into apport. Sick of working so hard to get a single backtrace. Questions are at the bottom.
I'm having a little nightmare here. I'm currently doing some C coding, which in my case always means a metric ton of segfaults. Most of the time I'm able to reproduce the bug with little to no trouble, but today I hit a wall.
My code produces segfaults inconsistently. I need the core dump that the shell message is talking about.
So I'm going on a hunt for a core dump for my precious little a.out. And that is when I start pulling my hair out.
My intuition tells me that core dump files should be stored somewhere in the working directory, which obviously isn't the case. After reading this, I happily typed:
ulimit -c 750000
And... nothing. The output of my program told me that it did dump core, but I can't find the file in the cwd. So after reading this, I learnt that I should do things to apport and core_pattern.
Changing core_pattern seems a bit too much for getting one core dump; I really don't want to mess with it, because I know I will forget about it later, and I tend to mess these things up really badly.
Apport has this magical property of choosing which core dumps are valuable and which are not. Its logs told me...
ERROR: apport (pid 7306) Sun Jan 3 14:42:12 2016: executable does not belong to a package, ignoring
...that my program isn't good enough for it.
Where is this core dump file?
Is there a way to get a core dump a single time, manually, without having to set everything up? I rarely need those as files per se; GDB alone is enough most of the time. Something like let_me_look_at_the_core_dump <program name> would be great.
I'm already balding a little, so any help would be appreciated.
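For the "single core dump without setting everything up" part, a few hedged options that avoid touching core_pattern at all (standard tools, but whether apport or systemd-coredump intercepts cores depends on the distribution):

# run the program under gdb; no core file is needed at all, gdb stops
# at the segfault and "bt" prints the backtrace
gdb -ex run ./a.out

# snapshot a core from an already running process (gcore ships with gdb)
gcore -o mycore <pid>

# on systemd-coredump systems, open gdb directly on the last core for a.out
coredumpctl gdb a.out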
So, today I learnt:
ulimit resets after reopening the shell.
I made a big mistake in my .zshrc - zsh nested and reopened itself after typing some commands, so the limit I had just set was silently lost.
After fiddling a bit with this, I also found a solution to the second problem: a small shell script.
ulimit -c 750000      # enable core dumps for this script's shell only
./a.out               # run the program; a segfault drops ./core (with the default core_pattern)
gdb ./a.out ./core    # open the binary together with the fresh core
ulimit -c 0           # turn core dumps back off
echo "profit"
At first I thought my program had memory leaks. But I terminated all Java processes and restarted Spring Tool Suite, keeping an eye on the task manager. In just a few minutes, javaw.exe had grown to 2,000,000 K of memory. The memory keeps going up without my issuing any commands in STS; STS has literally ONLY been opened, and I have no tabs open in it. The error log doesn't show any memory-related errors. Upon closing STS, javaw.exe DOES disappear from the task manager, and opening STS restarts the process again at around 150,000 K, quickly jumping to 600,000 K, then slowly growing and growing until it has consumed all my memory.
Any thoughts what might be causing this? I'm running a full system scan now just in case I've been compromised.
--edit--
This problem started around 10 AM Eastern and mysteriously went away at noon, when the security scan completed. No items were detected by the scan to lend an explanation to either the problem or its mysterious resolution. As of now javaw.exe is hovering at or around 700,000K. Very strange!
Sounds like a 2 hour bug! Be thankful it is gone but be sure to document it thoroughly if it occurs again. Sounds like a rough 2 hours you went through.
That is not completely unusual, unfortunately. Because Eclipse is made up of a bunch of plug-ins, sometimes a plug-in can go wild and start consuming memory and/or CPU. Using VisualVM (http://visualvm.java.net/) you can determine what is causing Eclipse to freak out. Depending on what it is, you might be able to disable that functionality. Because it could be so many different plug-ins, it doesn't surprise me that you could not find any answers by googling or looking here at Stack Overflow.
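If a GUI profiler is inconvenient, a hedged command-line alternative is to point the JDK's own tools at the STS process and see which classes are filling the heap (jps and jmap ship with the JDK; <pid> is a placeholder):

# list running JVMs with their PIDs and startup flags
jps -lv

# class histogram: which classes hold the most instances and bytes
jmap -histo <pid> | head -n 30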
I'm running Tomcat 7 with Apache 2.2 and mod_jk 1.2.26 on a Debian Lenny x64 server with 2 GB of RAM.
I have a strange problem with my server: every few hours, and sometimes (under load) every few minutes, my Tomcat AJP connector pauses with a memory-related error, but this error also seems to affect other parts of the system (e.g. some other running applications also stop working), and I have to reboot the server to fix things for a while.
I've checked catalina.out for several days, but there seems to be no single error pattern just before the AJP connector pauses with this message:
INFO: Pausing ProtocolHandler ["ajp-bio-8009"]
Sometimes there is this message before pausing:
Exception in thread "ajp-bio-8009-Acceptor-0" java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:597)...
and sometimes this one:
INFO: Reloading Context with name [] has started
Exception in thread "ContainerBackgroundProcessor[StandardEngine[Catalina]]" java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:597)
at org.apache.catalina.core.StandardContext.stopInternal(StandardContext.java:5482)
at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:230)
at org.apache.catalina.core.StandardContext.reload(StandardContext.java:3847)
at org.apache.catalina.loader.WebappLoader.backgroundProcess(WebappLoader.java:424)
at org.apache.catalina.core.ContainerBase.backgroundProcess(ContainerBase.java:1214)
at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1400)
at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1410)
at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1410)
at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1389)
at java.lang.Thread.run(Thread.java:619)
java.sql.SQLException: null, message from server: "Can't create a new thread (errno 11); if you are not out of available memory, you can consult the manual for a possible OS-dependent bug"...
and at other times the messages relate to other parts of the program.
I've checked my application source code and I don't believe it is causing the problem. I've also checked memory usage using JConsole. The puzzling thing is that when the server fails, it shows plenty of free memory in both the heap and non-heap JVM memory spaces. As I said before, after the server crashes many other applications also fail, and when I try to restart them I get a "resource temporarily unavailable" message (I've also checked my limits.conf file).
So I have been really confused by this serious problem for many days and have run out of ideas. Can anybody please give me any kind of suggestion for solving this complicated and unknown problem?
What could be the most likely reason for this error?
What are your limits for number of processes?
Check them with ulimit -a and look at the maximum number of user processes. If it's 1024, increase it.
Also, check the same thing for the user you use to start it (for example, if you run your stuff as the nobody user, run su -c "ulimit -a" -s /bin/sh nobody to see what limits that user actually gets). That should show you the problem (I had this a couple of days ago and completely forgot to check it).
At the moment this starts happening, you can also count all running threads and processes for that user (or, even better, monitor it with rrdtool or something similar) using ps -eLf | wc -l, which gives you a simple count of all processes and threads running on your system. This information, together with the limits of each particular user, should pin down your issue; a consolidated sketch follows below.
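A hedged sketch pulling those checks together (the tomcat7 user name and the 4096 value are placeholders; the limits.conf lines assume PAM applies them to the session that starts Tomcat):

# limits as seen by the user that runs Tomcat
su -c "ulimit -a" -s /bin/sh tomcat7

# total threads + processes on the whole system
ps -eLf | wc -l

# threads belonging only to that user
ps -eLf | awk '$1 == "tomcat7"' | wc -l

# raising the per-user process/thread limit in /etc/security/limits.conf:
#   tomcat7  soft  nproc  4096
#   tomcat7  hard  nproc  4096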
Use jvisualvm to check the heap usage of your JVM. If you see it slowly climbing over a period of time, that is a memory leak. Sometimes a memory leak is short-term and eventually gets cleared up, only to start again.
If you see a sawtooth pattern, take a heap dump near the peak of the sawtooth; otherwise take a heap dump after the JVM has been running long enough to be at high risk of an OOM error. Then copy that .hprof file to another machine and use Eclipse MAT (Memory Analyzer Tool) to open it up and identify likely culprits. You will still need to spend some time following references in the data structures and reading some Javadocs to figure out just what is using that HashMap or List that is growing out of control. The sorting options are also useful for focusing on the most likely problem areas.
There are no easy answers.
Note that there is also a command-line tool included with the Sun JVM which can trigger a heap dump. And if you have a good profiler, that can also be of use, because memory leaks are usually in a piece of code that is executed frequently and will therefore show up as a hot spot in the profiler.
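The command-line tool referred to above is presumably jmap, which ships with the Sun/Oracle JDK; a hedged example of triggering such a heap dump, with <pid> as a placeholder for the Tomcat JVM's process id:

# find the Tomcat JVM's PID
jps -l

# write a binary dump of live objects, ready to open in Eclipse MAT
jmap -dump:live,format=b,file=/tmp/tomcat-heap.hprof <pid>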
I finally found the problem: it was not actually a memory leak; the limit on the number of threads allowed for the VPS was causing it. My server was a Xen VPS with a default limit of 256 threads, so when it reached the maximum allowed threads, the supervisor killed some of the running threads (which is why some of my running processes stopped). By increasing the number of allowed threads to 512, the problem was completely solved (of course, if I increase maxThreads in the Tomcat settings, the problem will obviously come back).