My webapp allows users to execute some arbitrary code in a sandbox. To prevent forkbombs, the application calls setrlimit and limits RLIMIT_NPROC to 50 before executing user code. This worked great in Ubuntu 12.04 up till Ubuntu 13.04. However, after upgrading to Ubuntu 13.10 (which ships with Apache 2.4 and Linux 3.11), we hit the limit of 50 www-data processes, even when Apache2 is idle!
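For reference, a limit like the one described above could be applied roughly as follows. This is a minimal Python sketch, not the application's actual code: the helper name run_with_nproc_limit is hypothetical, and the question does not say how or in which language the webapp calls setrlimit.

```python
import resource
import subprocess
import sys


def run_with_nproc_limit(cmd, max_procs=50):
    """Run cmd in a child process whose RLIMIT_NPROC is capped at max_procs."""

    def apply_limit():
        # Runs in the child between fork() and exec().
        _, hard = resource.getrlimit(resource.RLIMIT_NPROC)
        if hard == resource.RLIM_INFINITY:
            limit = max_procs
        else:
            limit = min(max_procs, hard)  # an unprivileged process cannot exceed the hard limit
        resource.setrlimit(resource.RLIMIT_NPROC, (limit, limit))

    return subprocess.run(cmd, preexec_fn=apply_limit,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # Ask a child interpreter to report the soft limit it ended up with.
    probe = "import resource; print(resource.getrlimit(resource.RLIMIT_NPROC)[0])"
    result = run_with_nproc_limit([sys.executable, "-c", probe])
    print(result.stdout.strip())
```

On Linux, RLIMIT_NPROC is accounted per real user ID, so everything that user already has running counts toward the limit. It is not enforced for processes that hold CAP_SYS_ADMIN or CAP_SYS_RESOURCE, which typically includes root.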
The problem is most easily reproduced by running bash as user www-data with ulimit. First switch to user www-data and start bash:
jeroen@Ubuntu:/$ sudo su www-data
$ bash
www-data@Ubuntu:/$
Now gradually lower RLIMIT_NPROC until we hit problems:
#RLIMIT_NPROC=100: works fine
www-data@Ubuntu:/$ ulimit -u 100
www-data@Ubuntu:/$ ls
bin dev initrd.img lib64 mnt root srv usr vmlinuz.old
boot etc initrd.img.old lost+found opt run sys var
cdrom home lib media proc sbin tmp vmlinuz
#RLIMIT_NPROC=50: limit reached
www-data@Ubuntu:/$ ulimit -u 50
www-data@Ubuntu:/$ ls
bash: fork: retry: No child processes
bash: fork: retry: No child processes
bash: fork: Resource temporarily unavailable
Hence, after setting RLIMIT_NPROC to 50, the process can no longer fork. This implies that 50 or more processes are already running as user www-data. However, this does not seem to be the case: the server is just a blank, idle Apache 2.4. According to ps, only 2 processes are currently owned by www-data:
jeroen@Ubuntu:~$ ps aux | grep www-data
www-data 11473 0.0 0.5 631296 46164 ? Sl 14:28 0:01 /usr/sbin/apache2 -k start
www-data 11474 0.0 0.5 565656 45632 ? Sl 14:28 0:01 /usr/sbin/apache2 -k start
jeroen 12136 0.0 0.0 13644 956 pts/4 S+ 14:51 0:00 grep --color=auto www-data
So why is www-data hitting the RLIMIT_NPROC limit of 50 in Apache 2.4, even when idle?
Found the problem thanks to the suggestion from @sarnold. My application depends on mpm_prefork, and up to Ubuntu 13.04 this module was automatically enabled when the apache2-mpm-prefork package was installed. I assumed this was still the case, but it turned out that Apache was running mpm_event.
It seems that in Apache 2.4 the packaging of MPMs has changed, and mpm_prefork needs to be enabled manually after installation:
sudo a2dismod mpm_event
sudo a2enmod mpm_prefork
sudo service apache2 restart
Now the problems seem to have disappeared.
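A likely reason the MPM matters, offered here as an inference rather than something the answer states: on Linux, RLIMIT_NPROC counts threads, not just processes, while ps aux prints one line per process. The l in the STAT column of the ps output above marks both apache2 processes as multi-threaded, and mpm_event serves requests from worker threads. A minimal Python sketch for comparing process and thread counts per user by reading /proc (the function names are illustrative):

```python
import os
from collections import Counter


def parse_status(text):
    """Return (real_uid, thread_count) from the text of /proc/<pid>/status."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        fields[key] = value.split()
    # The Uid line lists real, effective, saved and filesystem UIDs in that order.
    return int(fields["Uid"][0]), int(fields["Threads"][0])


def usage_by_uid(proc_root="/proc"):
    """Count processes and threads per real UID."""
    processes, threads = Counter(), Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "status")) as handle:
                uid, nthreads = parse_status(handle.read())
        except (OSError, KeyError, IndexError, ValueError):
            continue  # process exited mid-scan or its status was incomplete
        processes[uid] += 1
        threads[uid] += nthreads
    return processes, threads
```

Looking up the UID of www-data in both counters (for example via pwd.getpwnam("www-data").pw_uid) shows how many threads count toward the limit. ps -L -u www-data gives a similar per-thread listing.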
Why can some processes not be migrated to a certain CPU by cpuset(7) while others can?
I found that the following processes could not actually be migrated to a certain CPU. When you check the cpuset filesystem, everything seems fine, but if you check the affinity of these processes with top or htop, you find that the cpuset does not take effect for them:
/sbin/init splash
/usr/sbin/rpc.idmapd
/lib/systemd/systemd-timesyncd
/lib/systemd/systemd-timesyncd
/usr/sbin/cups-browsed
/usr/sbin/sshd -D
/sbin/dhclient -d -q -sf /usr/lib/NetworkManager/nm-dhcp-helper -pf /var/run/dhclient-
/usr/sbin/dnsmasq --no-resolv --keep-in-foreground --no-hosts --bind-interfaces --pid-
sshd: john [priv]
sshd: john [priv]
sshd: john@notty
/usr/lib/openssh/sftp-server
lightdm --session-child 12 15
upstart-file-bridge --daemon --user
/usr/lib/gvfs/gvfsd-fuse /run/user/1000/gvfs -f -o big_writes
/usr/lib/at-spi2-core/at-spi-bus-launcher
/usr/bin/dbus-daemon --config-file=/etc/at-spi2/accessibility.conf --nofork --print-addre
/usr/lib/at-spi2-core/at-spi2-registryd --use-gnome-session
/usr/lib/update-notifier/system-crash-notification
/usr/lib/x86_64-linux-gnu/hud/hud-service
/usr/lib/dconf/dconf-service
/usr/lib/x86_64-linux-gnu/indicator-power/indicator-power-service
/usr/lib/x86_64-linux-gnu/indicator-power/indicator-power-service
/usr/lib/x86_64-linux-gnu/indicator-datetime/indicator-datetime-service
/usr/lib/x86_64-linux-gnu/indicator-sound/indicator-sound-service
/usr/lib/x86_64-linux-gnu/indicator-printers/indicator-printers-service
/usr/lib/evolution/evolution-source-registry
/usr/lib/evolution/evolution-source-registry
/usr/lib/colord/colord
/usr/lib/colord/colord
/usr/lib/evolution/evolution-calendar-factory
/usr/bin/gnome-software --gapplication-service
/usr/lib/unity-settings-daemon/unity-fallback-mount-helper
/usr/lib/gvfs/gvfs-udisks2-volume-monitor
/usr/lib/gvfs/gvfs-udisks2-volume-monitor
/usr/lib/udisks2/udisksd --no-debug
/usr/lib/gvfs/gvfs-gphoto2-volume-monitor
/usr/lib/evolution/evolution-calendar-factory-subprocess --factory contacts --bus-name or
zeitgeist-datahub
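The affinity check mentioned above can also be scripted. A minimal Python sketch, assuming a Linux /proc filesystem; the function names are illustrative, and the expected CPU set is whatever the cpuset was meant to allow. Affinity is a per-thread attribute on Linux, so the sketch walks every task of the process:

```python
import os


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-2,5' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus


def allowed_cpus(pid, tid):
    """Read Cpus_allowed_list for one task (thread) of a process."""
    with open(f"/proc/{pid}/task/{tid}/status") as status:
        for line in status:
            if line.startswith("Cpus_allowed_list:"):
                return parse_cpu_list(line.split(":", 1)[1])
    raise LookupError("Cpus_allowed_list not found")


def tasks_outside(pid, expected):
    """Return {tid: cpus} for threads allowed to run outside `expected`."""
    stray = {}
    for tid in os.listdir(f"/proc/{pid}/task"):
        cpus = allowed_cpus(pid, tid)
        if not cpus <= set(expected):
            stray[int(tid)] = cpus
    return stray
```

For example, tasks_outside(pid, {3}) returns an empty dict only if every thread of that process is restricted to CPU 3.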
I think that may be because your computer uses a NUMA model rather than an SMP model. This can solve the problem, but I'm not sure whether that is the reason.
Run container:
[root@localhost ~]# tty
/dev/pts/3
[root@localhost ~]# docker run -it nginx /bin/bash
root@bee12031f933:/# sleep 20
root@bee12031f933:/#
See:
[root@localhost ~]# tty
/dev/pts/2
[root@localhost ~]# w
17:43:24 up 19 days, 45 min, 5 users, load average: 0.00, 0.01, 0.05
USER TTY FROM LOGIN@ IDLE JCPU PCPU WHAT
root pts/0 192.168.1.22 16:24 1:01m 0.73s 0.00s sleep 20
root pts/1 192.168.1.22 11:31 1:02m 4.92s 4.65s docker run -it centos:7.7.1908
root pts/2 192.168.1.22 16:31 4.00s 0.70s 0.01s w
root pts/3 192.168.1.22 15:09 4.00s 0.25s 0.07s docker run -it nginx /bin/bash
root pts/4 192.168.1.22 16:41 44.00s 0.06s 0.06s -bash
The Docker container is running in pts/3, and I execute "sleep 20" inside the container. When I then run "w" on the host, it shows "sleep 20" as running on pts/0. What is the reason?
Why does the host display commands executed in containers?
Docker works similarly to LXC: it sandboxes processes from one another and controls their resource allocations.
Since the resources are "separated", each side shows information based on what it knows. Inside the container the terminal is /dev/pts/0, even though docker run was started from a different terminal on the host:
myuser@localhost: ~ $ tty
/dev/pts/1
myuser@localhost: ~ $ docker run --rm -it ubuntu:18.04 bash
root@36ed505961f4:/# tty
/dev/pts/0
See the kernel namespaces documentation for more info.
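To see the separation directly, the namespaces of a host shell and a containerized process can be compared through /proc. A minimal Python sketch, assuming Linux; the function names are illustrative, and reading another user's /proc/<pid>/ns entries typically requires root:

```python
import os

NAMESPACES = ("mnt", "pid", "net", "uts", "ipc")


def namespace_id(pid, kind):
    """Return the namespace identifier link, in the form 'mnt:[<inode>]'."""
    return os.readlink(f"/proc/{pid}/ns/{kind}")


def differing_namespaces(pid_a, pid_b, kinds=NAMESPACES):
    """List the namespace kinds in which the two processes differ."""
    return [kind for kind in kinds
            if namespace_id(pid_a, kind) != namespace_id(pid_b, kind)]
```

Called with the PID of a host shell and the host-side PID of a process inside the container, differing_namespaces lists the namespaces the two do not share. A differing mnt entry would be consistent with the container seeing its own /dev/pts.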
Tried to start redis-server but got:
26195:C 27 Aug 17:05:11.684 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
26195:M 27 Aug 17:05:11.684 * Increased maximum number of open files to 10032 (it was originally set to 1024).
26195:M 27 Aug 17:05:11.685 # Creating Server TCP listening socket *:6379: bind: Address already in use
I ran lsof -wni tcp:3000, killed the process it listed on localhost, and tried restarting redis-server again, but got the same error as above.
Tried: ps -aux | grep redis (output below), then sudo kill -9 6379
nick4896 12238 0.0 0.1 41432 9048 ? Sl Aug26 0:14 redis-server *:6379
nick4896 26304 0.0 0.0 21300 984 pts/21 S+ 17:08 0:00 grep --color=auto redis
And ran sudo service redis-server restart, and got:
Failed to restart redis-server.service: Unit redis-server.service not found.
Any ideas?
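One detail in the ps output above is worth noting: 6379 is the port redis-server listens on, while the process ID is the second column (12238), so kill -9 6379 does not target the running server. A minimal Python sketch that pulls PIDs out of ps aux output; the helper name pids_matching is illustrative, and the sample text is the output quoted in the question:

```python
def pids_matching(ps_output, needle):
    """Return PIDs (2nd column of `ps aux`) whose command contains needle."""
    pids = []
    for line in ps_output.splitlines():
        cols = line.split(None, 10)  # the 11th field is the full command line
        if len(cols) < 11 or not cols[1].isdigit():
            continue  # header or malformed line
        command = cols[10]
        if needle in command and not command.startswith("grep"):
            pids.append(int(cols[1]))
    return pids


sample = (
    "nick4896 12238 0.0 0.1 41432 9048 ? Sl Aug26 0:14 redis-server *:6379\n"
    "nick4896 26304 0.0 0.0 21300 984 pts/21 S+ 17:08 0:00 grep --color=auto redis\n"
)
print(pids_matching(sample, "redis-server"))  # prints [12238]
```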
The problem is that the symlink from redis.service to redis-server.service was deleted.
The command
sudo systemctl enable redis-server
creates the symlink:
Created symlink /etc/systemd/system/redis.service → /lib/systemd/system/redis-server.service.
I came across this as well; I would suggest running systemctl daemon-reload.
Not an answer, but to complete Igor Kavzov's answer, this is the code to enter at the terminal:
sudo ln /lib/systemd/system/redis.service /etc/systemd/system/redis-server.service
I'm trying to stop the Apache2 service, but I get a PID error:
#service apache2 stop
[FAIL] Stopping web server: apache2 failed!
[warn] There are processes named 'apache2' running which do not match your pid file which are left untouched in the name of safety, Please review the situation by hand. ... (warning).
Trying to kill those processes:
#kill -9 $(ps aux | grep apache2 | awk '{print $2}')
but they get re-spawned again:
#ps aux | grep apache2
root 19279 0.0 0.0 4080 348 ? Ss 05:10 0:00 runsv apache2
root 19280 0.0 0.0 4316 648 ? S 05:10 0:00 /bin/sh /usr/sbin/apache2ctl -D FOREGROUND
root 19282 0.0 0.0 91344 5424 ? S 05:10 0:00 /usr/sbin/apache2 -D FOREGROUND
www-data 19284 0.0 0.0 380500 2812 ? Sl 05:10 0:00 /usr/sbin/apache2 -D FOREGROUND
www-data 19285 0.0 0.0 380500 2812 ? Sl 05:10 0:00 /usr/sbin/apache2 -D FOREGROUND
And although the processes are running, I can't connect to the server on port 80. /var/log/apache2/error.log.1 has no new messages when I do the kill -9.
Before I tried to restart, everything worked perfectly.
Running on Debian: Linux adara 3.2.0-4-amd64 #1 SMP Debian 3.2.54-2 x86_64 GNU/Linux
Update:
I also tried apache2ctl:
#/usr/sbin/apache2ctl -k stop
AH00526: Syntax error on line 76 of /etc/apache2/apache2.conf:
PidFile takes one argument, A file for logging the server process ID
Action '-k stop' failed.
The Apache error log may have more information.
However, there is no PID file in /var/run/apache2.
I'm new to Linux. It looks like this has something to do with startup scripts, but I can't figure out what exactly.
The following command finds the process using port 80:
lsof -i tcp:80
Kill the process using its PID. Then restart the system once to check whether there is a startup script that runs and occupies port 80, preventing you from starting your service.
For startup scripts, you can check
/etc/init.d/ or /etc/rc.local or crontab -e
You can also consult the official Apache documentation on stop/restart operations.
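The ps output in the question includes a runsv apache2 process. runsv is the per-service supervisor from runit, which restarts a service when it exits, so it may be what respawns the killed processes; this is an inference from that output, not something the thread confirms. A minimal Python sketch for checking which process sits above a respawning one, assuming a Linux /proc filesystem (the function names are illustrative):

```python
def parse_parent(status_text):
    """Return (name, ppid) from the text of /proc/<pid>/status."""
    name, ppid = None, None
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key == "Name":
            name = value.strip()
        elif key == "PPid":
            ppid = int(value)
    if name is None or ppid is None:
        raise ValueError("status text lacks Name or PPid")
    return name, ppid


def ancestry(pid):
    """Walk from pid up through its parents, returning [(pid, name), ...]."""
    chain = []
    while pid > 0:
        with open(f"/proc/{pid}/status") as handle:
            name, ppid = parse_parent(handle.read())
        chain.append((pid, name))
        pid = ppid
    return chain
```

If the chain for an apache2 process passes through runsv, the service would normally be stopped through runit (for example with sv down apache2) rather than with kill. Whether that applies here depends on how the machine was set up.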
$ jcmd -l
418 sun.tools.jcmd.JCmd -l
$ jstat -gcutil -t 10 250ms 1
10 not found
I am aware of the bug in the JDK related to attaching jstat as root to a process running as a different user.
Here, the Docker container has a single user, root, and, as the ps output below shows, Cassandra is running as root.
$ whoami
root
I have tried to do the following:
$ sudo -u root jcmd -l
Any help is appreciated.
The Docker container is debian:jessie,
running this Java version:
openjdk version "1.8.0_66-internal"
Here's the output of ps -ef:
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 17:40 ? 00:00:00 /bin/bash /run.sh
root 10 1 11 17:40 ? 00:02:25 java -ea -javaagent:/usr/share/c
root 375 0 0 17:49 ? 00:00:00 bash
root 451 375 0 18:00 ? 00:00:00 ps -ef
Aside: jstack successfully dumps out the stack traces of the threads.
I know at least two possible reasons why this can happen.
Java is run with the -XX:+PerfDisableSharedMem option. This option sometimes helps to reduce JVM safepoint pauses, but it also makes the JVM invisible to jps and jstat. This is a very likely case, because you are running Cassandra, and recent Cassandra versions have this option ON by default.
The Java process has a different mount namespace, so that /tmp of the Java process is not physically the same directory as /tmp of your shell. The directory /tmp/hsperfdata_root must be accessible in order to use jps or jstat. This is also a plausible reason, since you are using Docker containers.
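Both conditions can be checked from a shell in the same container. A minimal Python sketch, assuming a Linux /proc filesystem and that the checker shares the JVM's PID namespace; the function names and result keys are illustrative. It only inspects the command line, so a flag supplied another way (for example through JAVA_TOOL_OPTIONS) would be missed:

```python
import os


def perf_shared_mem_disabled(cmdline):
    """True if the JVM command line leaves PerfDisableSharedMem switched on.

    cmdline is the raw content of /proc/<pid>/cmdline (NUL-separated bytes).
    """
    disabled = False
    for arg in cmdline.split(b"\0"):
        if arg == b"-XX:+PerfDisableSharedMem":
            disabled = True
        elif arg == b"-XX:-PerfDisableSharedMem":
            disabled = False  # a later flag overrides an earlier one
    return disabled


def diagnose(pid, user="root"):
    """Report the flag and the visibility of the hsperfdata file for one JVM."""
    with open(f"/proc/{pid}/cmdline", "rb") as handle:
        flag = perf_shared_mem_disabled(handle.read())
    perf_file = f"hsperfdata_{user}/{pid}"
    return {
        "PerfDisableSharedMem": flag,
        # /tmp as seen by this process
        "visible_in_my_tmp": os.path.exists(f"/tmp/{perf_file}"),
        # /tmp as seen by the JVM, reached through its root directory link
        "visible_in_jvm_tmp": os.path.exists(f"/proc/{pid}/root/tmp/{perf_file}"),
    }
```

For the process list in the question, diagnose(10) would be the call to try. A file that is visible in the JVM's /tmp but not in the shell's points to the second reason, while a set flag points to the first.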