PHP process memory leak - linux

I'm having a really strange problem when running a PHP script as a Daemon. First I want to say that I've been using those kind of scripts for several years now without any problem.
During the past weekend, I rebuilt one of our EC2 servers in AWS and I started to see some memory leaking from my daemon scripts..
I started monitoring one of them by adding a log on each cycle of my script.
System_Daemon::info("After a cycle peak : ".number_format((memory_get_peak_usage()/1024/1024), 2)."MB and real : ".number_format((memory_get_usage()/1024/1024), 2)."MB");
Both numbers from this log are showing the exact same number for each script cycle
[Nov 21 10:24:14] info: After a cycle peak : 5.31MB and real : 4.87MB
but when I look into the process on my system it's leaking memory. All the numbers regarding memory (VIRT,RES,SHR and %MEM) are going up until the process is shutdown by the system..
I really don't know where to start looking to fix this. The only difference I've seen before/after my server rebuild is that the PHP version has slightly changed from PHP 7.0.33-0ubuntu0.16.04.2 to PHP 7.0.33-0ubuntu0.16.04.7.
Can anyone help me understand what is going on?
Thanks.

Related

Can the core dump cause memory leak?

I recently did this to my system
ulimit -c unlimited
and it created as designed, a core file for the program I use, ever since, I have had random crashes to my program but I haven't had the possibility to check the core dump to see what errors it gave, as it does daily restart of the program, I assume the previous errors are gone, if they are not, please tell me so I can look them up.
But my question is: is there in any possible way that this new ulimit command I used, be the issue with the server crash? because for years ive runned the same program with no crashes and since this commmand, I have had random crashes from time to time that somewhat feels like it loops for around 5 minutes then restarts the program.
Any help is appreciated, as I cannot reproduce the issue

signal 11 in a linux shell remote site. How can I troubleshoot

I'm a bio major only recently doing major coding for research stuff. Our campus in order to support research has an on campus supercomputer for researcher use. I work remotely from this supercomputer and it uses a linux shell to access it and submit jobs. I'm writing a job submission script for the alignment of a lot of genomes using a program installed on the computer called Mauve. Now I've run a job on Mauve fine before and have altered the script from that job to fit this job. Only this time I keep getting this error
Storing raw sequence at
/scratch/addiseg/Elizabethkingia_clonalframe/rawseq16360.000
Sequence loaded successfully.
GCA_000689515.1_E27107v1_PRJEB5243_genomic.fna 4032057 base pairs.
Storing raw sequence at
/scratch/addiseg/Elizabethkingia_clonalframe/rawseq16360.001
Sequence loaded successfully.
e.anophelisGCA_000496055.1_NUH11_genomic.fna 4091484 base pairs.
Caught signal 11
Cleaning up and exiting!
Temporary files deleted.
So I've got no idea how to troubleshoot this. I'm so sorry if this is super basic and wasting time but I don't know how to troubleshoot this at a remote site. All possible solutions I've seen so far require me to access the hardware or software neither of which I can control.
My current submission script is this.
module load mauve
progressiveMauve --output=8elizabethkingia-alignment.etc.xmfa --output-guide-tree=8.elizabethkingia-alignment.etc.tree --backbone-output=8.elizabethkingia-alignment.etc.backbone --island-gap-size=100 e.anophelisGCA_000331815.1_ASM33181v1_genomicR26.fna GCA_000689515.1_E27107v1_PRJEB5243_genomic.fna e.anophelisGCA_000496055.1_NUH11_genomic.fna GCA_001596175.1_ASM159617v1_genomicsrr3240400.fna e.meningoseptica502GCA_000447375.1_C874_spades_genomic.fna e.meningoGCA_000367325.1_ASM36732v1_genomicatcc13253.fna e.anophelisGCA_001050935.1_ASM105093v1_genomicPW2809.fna e.anophelisGCA_000495935.1_NUHP1_genomic.fna

javaw.exe consumes memory on starting STS

At first I thought my program had memory leaks. But I terminated all java processes and restarted Spring Tools Suite. I kept an eye on the task manager. In just a few minutes, javaw.exe had grown to 2,000,000 K Memory. The memory keeps going up, without issuing commands in STS. STS has literally ONLY been opened. I have no tabs open in it. The error log doesn't show any memory related errors. Upon closing STS javaw.exe DOES disappear from task manager and opening STS restarts the process over again around 150,000K, quickly jumping to 600,000K, then slowly growing and growing until it has consumed all my memory.
Any thoughts what might be causing this? I'm running a full system scan now just in case I've been compromised.
--edit--
This problem started around 10 AM Eastern and mysteriously went away at noon, when the security scan completed. No items were detected by the scan to lend an explanation to either the problem or its mysterious resolution. As of now javaw.exe is hovering at or around 700,000K. Very strange!
Sounds like a 2 hour bug! Be thankful it is gone but be sure to document it thoroughly if it occurs again. Sounds like a rough 2 hours you went through.
That is not completely unusual unfortunately. Because Eclipse is made up of a bunch of plug-ins some times a plug-in can go wild and start consuming memory and/or CPU. Using VisualVM (http://visualvm.java.net/) you can determine what is causing Eclipse to freak out. Depending on what it is, you might be able to disable that functionality. Because it could be so many different plug-ins it doesn’t surprise me you could not find any answers googling or looking here at StackOverflow.

Memory leak with apache, tomcat & mod_jk & mysql

I'm running tomcat 7 with apache 2.2 & mod_jk 1.2.26 on a debian-lenny x64 server with 2GB of RAM.
I've a strange problem with my server: every several hour & sometimes (under load) every several minutes, my tomcat ajp-connector pauses with a memory leak error, but seems this error also effects some other parts of system (e.g some other running applications also stop working) & I have to reboot the server to solve the problem for a while.
I've checked catalina.out for several days, but it seem's there is not a unique error pattern just before pausing ajp with this message:
INFO: Pausing ProtocolHandler ["ajp-bio-8009"]
Sometimes there is this message before pausing:
Exception in thread "ajp-bio-8009-Acceptor-0" java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:597)...
& sometimes this one:
INFO: Reloading Context with name [] has started
Exception in thread "ContainerBackgroundProcessor[StandardEngine[Catalina]]" java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:597)
at org.apache.catalina.core.StandardContext.stopInternal(StandardContext.java:5482)
at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:230)
at org.apache.catalina.core.StandardContext.reload(StandardContext.java:3847)
at org.apache.catalina.loader.WebappLoader.backgroundProcess(WebappLoader.java:424)
at org.apache.catalina.core.ContainerBase.backgroundProcess(ContainerBase.java:1214)
at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1400)
at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1410)
at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1410)
at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1389)
at java.lang.Thread.run(Thread.java:619)
java.sql.SQLException: null, message from server: "Can't create a new thread (errno 11); if you are not out of available memory, you can consult the manual for a possible OS-dependent bug"...
& some other times the output messages related to some other parts of program.
I've checked my application source code & I don't guess it causes the problem, I've also checked memory usage using jConsole. The wanderfull point is that when server fails, is shows a lot of free memory on both heap & non-heap jvm memory space. As I told before, after crashing server, many other applications also fail & when I want to restart them it gives a resource temporary unavailable message (I've also checked my limits.conf file).
So I really really confused with this serious problem many days & i have really no more idea about it. So, can anybody please give me any kind of suggestion to solve this complicated & unknown problem ???
What could be the most possible reason for this error ?
What are your limits for number of processes?
Check them with uname -a and check maximum number of processes. If it's 1024, increase it.
Also, check the same thing for user which you are using to start it (for example, if you are using nobody user for your stuff, run su -c "ulimit -a" -s /bin/sh nobody to see what actually this user sees as limits). That should show you a problem (had it couple of days ago, totally missed to check this).
In the moment when that starts happening, you can also count all your running threads and processes for that user (or even better to monitor it using rrdtool or something else) with "ps -eLf | wc -l" which will give you back simple count of all processes and threads running on your system. This information, together with limits for all particular users, should solve your issue.
Use jvisualvm to check the heap usage of your jvm. If you see it slowly climbing over a period of time, that is a memory leak. Sometimes a memory leak is short term and eventually gets cleared up, only to start again.
If you see a sawtooth pattern, take a heap dump near the peak of the sawtooth, otherwise take a heapdump after the jvm has been running long enough to be at a high risk of and OOM error. Then copy that .hprof file to another machine and use the Eclipse MAT (Memory Analysis Tool) to open it up and identify likely culprits. You will still need to spend some time following refs in the data structure and also reading some Javadocs to figure out just what is using that Hashmap or List that is growing out of control. The sorting options are also useful to focus on the most likely problem areas.
There are no easy answers.
Note that there is also a command line tool included with the SUN jvm which can trigger a heapdump. And if you have a good profiler that can also be of use because memory leaks are usually in a piece of code that is executed frequently and therefore will show up as a hot spot in the profiler.
I finally found the problem: it was not actually a memory leak, but the limitation in number of allowed threads for the VPS was caused the problem. My server was a Xen vps with default limitation of 256 threads, so when it reached the maximum allowed threads, the supervisor was killed some of running threads (that was cause of stopping some of my running processes). By increasing number of allowed threads to 512, the problem totally solved (of course if I increase maxThreads in tomcat settings, its obvious that the problem will rise again).

High %wa CPU load when running PHP as CLI

Sorry for the vague question, but I've just written some php code that executes itself as CLI, and I'm pretty sure it's misbehaving. When I run "top" on the command line it's showing very little resources given to any individual process, but between 40-98% to iowait time (%wa). I usually have about .7% distributed between %us and %sy, with the remaining resources going to idle processes (somewhere between 20-50% usually).
This server is executing MySQL queries in, easily, 300x the time it takes other servers to run the same query, and it even takes what seems like forever to log on via SSH... so despite there being some idle cpu time left over, it seems clear that something very bad is happening. Whatever scripts are running, are updating my MySQL database, but it seems to be exponentially slower then when they started.
I need some ideas to serve as launch points for me to diagnose what's going on.
Some things that I would like to know are:
How I can confirm how many scripts are actually running
Is there anyway to confirm that these scripts are actually shutting down when they are through, and not just "hanging around" taking up CPU time and memory?
What kind of bottlenecks should I be checking to make sure I don't create too many instances of this script so this doesn't happen again.
I realize this is probably a huge question, but I'm more then willing to follow any links provided and read up on this... I just need to know where to start looking.
High iowait means that your disk bandwidth is saturated. This might be just because you're flooding your MySQL server with too many queries, and it's maxing out the disk trying to load the data to execute them.
Alternatively, you might be running low on physical memory, causing large amounts of disk IO for swapping.
To start diagnosing, run vmstat 60 for 5 minutes and check the output - the si and so columns show swap-in and swap-out, and the bi and bo lines show other IO. (Edit your question and paste the output in for more assistance).
High iowait may mean you have a slow/defective disk. Try checking it out with a S.M.A.R.T. disk monitor.
http://www.linuxjournal.com/magazine/monitoring-hard-disks-smart
ps auxww | grep SCRIPTNAME
same.
Why are you running more than one instance of your script to begin with?

Resources