Getting Linux to freeze under memory pressure, as Windows does, instead of killing the process - linux

I have a script that uses a lot of memory, and on Linux it simply gets "killed" or cannot be assigned more memory. When I run the same script on Windows, it consumes all my memory and freezes my PC. I want the same thing to happen on my Linux server instead of the process being killed.
I have tried changing vm.overcommit_memory to 0, 1 and 2, but none of them work. I also tried some other things, like disabling the OOM killer, but I can't find the value vm.oom-killer. Please help.
Update:
Another acceptable solution would be to limit the memory it uses, for example to 10 GB: if it exceeds that, don't let it consume more, but don't kill the process either; just let it finish.
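One possible way to get that behavior, assuming a distribution with systemd and the cgroup v2 memory controller: the MemoryHigh property throttles the process and reclaims its memory instead of killing it once the limit is crossed (the script name below is just a placeholder):

# throttle rather than kill once the process crosses 10G (script name hypothetical)
sudo systemd-run --scope -p MemoryHigh=10G python3 myscript.py

By contrast, MemoryMax is a hard cap that triggers an OOM kill inside the scope, so MemoryHigh is closer to "slow down but let it finish".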

Related

Can a task that is killed incorrectly on Linux be considered a memory leak?

I use a Raspberry Pi [and run Ubuntu Server 20.04 LTS] quite often, so it is advantageous to use memory as responsibly as possible. That being said, I run a number of processes that seem to run fairly efficiently, at about ~2GB of the 4GB of available memory. Eventually, though, memory usage seems to grow closer and closer to the 4GB level. While investigating memory usage with htop, I noticed something about the Python scripts I'm running (I've provided an image of what I'm describing): the processes seem to stack up.
Could this be because I'm using CTRL + Z rather than CTRL + C to restart my Python script?
Please let me know if I can be more specific.
Yes, it's because you use ctrl-z. Use ctrl-c to interrupt your processes, by sending them SIGINT.
ctrl-z does not terminate your process: it suspends it (with SIGTSTP) and leaves it in memory, which is why the scripts stack up. If you resume a job in the background with bg, it will keep running until it needs terminal input, then stop again.
Try this when running some terminal program on your rPi. (It works with vi and many other programs.)
Press ctrl-z
Then do some shell commands. ls or whatever
Then type fg to resume your suspended process.
Believe it or not, this stuff works exactly the same on my rPi running GNU/Linux as it did on Bell Labs UNIX Seventh Edition on a PDP 11/70 back in 1976. But that computer had quite a bit less RAM.
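A minimal terminal session showing the difference (the script name is hypothetical):

python3 sensor_logger.py    # some long-running script
# press ctrl-z; the shell prints something like: [1]+ Stopped  python3 sensor_logger.py
jobs      # lists suspended jobs, each one still holding its memory
fg %1     # resume job 1 in the foreground
# press ctrl-c to send SIGINT and actually terminate it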

Can a core dump cause a memory leak?

I recently did this to my system
ulimit -c unlimited
and, as designed, it created a core file for the program I use. Ever since, I have had random crashes in my program, but I haven't had the chance to check the core dump to see what errors it gave. Since the program restarts daily, I assume the previous errors are gone; if they are not, please tell me so I can look them up.
But my question is: is there any possible way that this new ulimit command is the cause of the server crashes? For years I've run the same program with no crashes, and since this command I have had random crashes from time to time; it somewhat feels like it loops for around 5 minutes and then restarts the program.
Any help is appreciated, as I cannot reproduce the issue
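One quick thing worth checking is whether the core files themselves are filling the disk; a sketch, with assumed paths:

ulimit -c                            # current core-file size limit for this shell
cat /proc/sys/kernel/core_pattern    # where the kernel writes core files
du -sh /path/to/program/core* 2>/dev/null    # disk space taken by old cores (path assumed)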

How to prevent long-running backup job from being killed

How do you prevent a long-running memory-intensive tar-based backup script from getting killed?
I have a cron job that runs a command like this daily:
tar --create --verbose --preserve-permissions --gzip --file "{backup_fn}" {excludes} / 2> /var/log/backup.log
It writes to an external USB drive. Normally the file generated is 100GB, but after I upgraded to Ubuntu 16, now the log file shows the process gets killed about 25% of the way through, presumably because it's consuming a lot of memory and/or putting the system under too much load.
How do I tell the kernel not to kill this process, or tweak it so it doesn't consume so many resources that it needs to be killed?
If you are certain that the process gets killed because it consumes too much memory, you can try increasing the swappiness value in /proc/sys/vm/swappiness. By increasing swappiness you might be able to get away from this scenario. You can also try tuning oom_kill_allocating_task; the default is 0, which tries to find the rogue memory-hogging task and kill it. If you change it to 1, the oom_killer will kill the calling (allocating) task instead.
If none of the above works, you can try oom_score_adj under /proc/$pid/oom_score_adj. oom_score_adj accepts values from -1000 to 1000; the lower the value, the less likely the process is to be killed by the oom_killer. If you set it to -1000, OOM killing is disabled for that process entirely. But you should know exactly what you are doing.
Hope this will give you some idea.
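Putting those suggestions together, a hedged sketch for the cron script (the tar arguments are abbreviated from the question, {backup_fn} and {excludes} are the question's own placeholders, and writing a negative oom_score_adj requires root):

sysctl -w vm.swappiness=80    # raise swappiness (the default is typically 60)
tar --create --gzip --file "{backup_fn}" {excludes} / 2> /var/log/backup.log &
echo -1000 > /proc/$!/oom_score_adj    # exempt the backup from OOM killing; use with care
wait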

Memory leak with apache, tomcat & mod_jk & mysql

I'm running Tomcat 7 with Apache 2.2 & mod_jk 1.2.26 on a Debian Lenny x64 server with 2GB of RAM.
I have a strange problem with my server: every few hours, & sometimes (under load) every few minutes, my Tomcat AJP connector pauses with a memory leak error, and this error also seems to affect other parts of the system (e.g. some other running applications also stop working), & I have to reboot the server to fix the problem for a while.
I've checked catalina.out for several days, but it seems there is no single error pattern just before the AJP connector pauses with this message:
INFO: Pausing ProtocolHandler ["ajp-bio-8009"]
Sometimes there is this message before pausing:
Exception in thread "ajp-bio-8009-Acceptor-0" java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:597)...
& sometimes this one:
INFO: Reloading Context with name [] has started
Exception in thread "ContainerBackgroundProcessor[StandardEngine[Catalina]]" java.lang.OutOfMemoryError: unable to create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:597)
at org.apache.catalina.core.StandardContext.stopInternal(StandardContext.java:5482)
at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:230)
at org.apache.catalina.core.StandardContext.reload(StandardContext.java:3847)
at org.apache.catalina.loader.WebappLoader.backgroundProcess(WebappLoader.java:424)
at org.apache.catalina.core.ContainerBase.backgroundProcess(ContainerBase.java:1214)
at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1400)
at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1410)
at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1410)
at org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1389)
at java.lang.Thread.run(Thread.java:619)
java.sql.SQLException: null, message from server: "Can't create a new thread (errno 11); if you are not out of available memory, you can consult the manual for a possible OS-dependent bug"...
& other times the output messages relate to other parts of the program.
I've checked my application source code & I don't think it causes the problem. I've also checked memory usage using JConsole. The puzzling thing is that when the server fails, it shows a lot of free memory in both heap & non-heap JVM memory spaces. As I said before, after the server crashes, many other applications also fail, & when I try to restart them I get a "resource temporarily unavailable" message (I've also checked my limits.conf file).
I've been really confused by this serious problem for many days now & I'm out of ideas, so can anybody please give me any kind of suggestion for solving this complicated & unknown problem?
What could be the most likely reason for this error?
What are your limits for the number of processes?
Check them with ulimit -a and look at the maximum number of processes. If it's 1024, increase it.
Also, check the same thing for the user you use to start it (for example, if you run your stuff as the nobody user, run su -c "ulimit -a" -s /bin/sh nobody to see what that user actually sees as limits). That should show you the problem (I hit this a couple of days ago and completely missed checking it).
When that starts happening, you can also count all running threads and processes for that user (or, even better, monitor it using rrdtool or something else) with ps -eLf | wc -l, which gives you a simple count of all processes and threads running on your system. This information, together with the limits for each particular user, should solve your issue.
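Combined into one quick check (the tomcat user name is an assumption; substitute whatever user runs your stack):

ps -eLf | wc -l                        # all processes and threads system-wide
ps -Lf -u tomcat | wc -l               # the same count for one user
su -c "ulimit -a" -s /bin/sh tomcat    # the limits that user actually sees
cat /proc/sys/kernel/threads-max       # the system-wide thread ceiling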
Use jvisualvm to check the heap usage of your JVM. If you see it slowly climbing over a period of time, that is a memory leak. Sometimes a memory leak is short-term and eventually gets cleared up, only to start again.
If you see a sawtooth pattern, take a heap dump near the peak of the sawtooth; otherwise, take a heap dump after the JVM has been running long enough to be at high risk of an OOM error. Then copy that .hprof file to another machine and use the Eclipse MAT (Memory Analysis Tool) to open it and identify likely culprits. You will still need to spend some time following references through the data structures, and reading some Javadocs, to figure out just what is using that HashMap or List that is growing out of control. The sorting options are also useful for focusing on the most likely problem areas.
There are no easy answers.
Note that there is also a command-line tool included with the Sun JVM which can trigger a heap dump. And if you have a good profiler, that can also be of use, because memory leaks are usually in a piece of code that is executed frequently and will therefore show up as a hot spot in the profiler.
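The bundled tool is presumably jmap; a typical invocation (the pid is a placeholder):

jmap -dump:live,format=b,file=heap.hprof <pid>    # write a binary heap dump of live objects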
I finally found the problem: it was not actually a memory leak; it was the VPS's limit on the number of allowed threads that caused the problem. My server was a Xen VPS with a default limit of 256 threads, so when it reached the maximum, the hypervisor killed some of the running threads (which is what stopped some of my running processes). Increasing the number of allowed threads to 512 solved the problem completely (of course, if I increase maxThreads in the Tomcat settings, the problem will obviously come back).

High %wa CPU load when running PHP as CLI

Sorry for the vague question, but I've just written some PHP code that executes itself as CLI, and I'm pretty sure it's misbehaving. When I run top on the command line, it shows very little resources given to any individual process, but between 40-98% going to iowait time (%wa). I usually have about 0.7% distributed between %us and %sy, with the remaining resources idle (somewhere between 20-50% usually).
This server is executing MySQL queries in, easily, 300x the time it takes other servers to run the same query, and it even takes what seems like forever to log in via SSH... so despite there being some idle CPU time left over, it seems clear that something very bad is happening. Whatever scripts are running are updating my MySQL database, but they seem to be exponentially slower than when they started.
I need some ideas to serve as launch points for me to diagnose what's going on.
Some things that I would like to know are:
How can I confirm how many scripts are actually running?
Is there any way to confirm that these scripts are actually shutting down when they are through, and not just "hanging around" taking up CPU time and memory?
What kinds of bottlenecks should I be checking, so that I don't create too many instances of this script and this doesn't happen again?
I realize this is probably a huge question, but I'm more than willing to follow any links provided and read up on this... I just need to know where to start looking.
High iowait means that your disk bandwidth is saturated. This might be just because you're flooding your MySQL server with too many queries, and it's maxing out the disk trying to load the data to execute them.
Alternatively, you might be running low on physical memory, causing large amounts of disk IO for swapping.
To start diagnosing, run vmstat 60 for 5 minutes and check the output - the si and so columns show swap-in and swap-out, and the bi and bo columns show other IO. (Edit your question and paste the output in for more assistance.)
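For example:

vmstat 60 5    # five samples, 60 seconds apart
# si/so columns: swap-in/swap-out activity (sustained nonzero values suggest memory pressure)
# bi/bo columns: blocks read from / written to block devices (ordinary disk IO)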
High iowait may mean you have a slow/defective disk. Try checking it out with a S.M.A.R.T. disk monitor.
http://www.linuxjournal.com/magazine/monitoring-hard-disks-smart
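For example, assuming smartmontools is installed and the disk is /dev/sda:

sudo smartctl -H /dev/sda    # overall SMART health self-assessment
sudo smartctl -a /dev/sda    # full attribute report (check reallocated/pending sectors)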
ps auxww | grep SCRIPTNAME
The same command answers your second question: if old instances still appear in the output after they should have finished, they're hanging around.
Why are you running more than one instance of your script to begin with?
