Is it better to put the php-fpm Unix socket in ephemeral storage or on EBS? - linux

I'm trying to tune my EC2 performance. One part of that is to utilize the ephemeral storage for all I/O. For php-fpm, I'm using a unix socket instead of TCP/IP since everything is local. Considering the EBS storage only has 24 IOPS (for 8 GB), I'm wondering if it's better to move the php-fpm socket to ephemeral storage. Is there any I/O activity inside the unix socket file, given that the file size is always 0?
[root# php-fpm]# ls -al
total 12
drwxr-xr-x 2 root root 4096 Aug 5 19:37 .
drwxr-xr-x 16 root root 4096 Aug 7 03:27 ..
-rw-r--r-- 1 root root 4 Aug 5 19:37 php-fpm.pid
srw-rw-rw- 1 nginx nginx 0 Aug 5 19:37 php-fpm.sock

EBS is a network-based service, so every single operation depends on the network. The docs say:
An Amazon EBS volume is off-instance storage that can persist independently from the life of an instance.
Consider ephemeral storage for your socket. If you use EBS, don't forget to pre-warm the volume by touching every block with dd before first use:
dd if=/dev/zero of=/dev/xvdf bs=1M
But don't do it on the root / disk, just on an extra EBS disk if you prefer to use that.
P.S. For the details of how to warm up an EBS volume, please read the official docs.
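If you do move the socket onto the instance store, a minimal sketch of the matching php-fpm pool and nginx settings could look like this (the /media/ephemeral0 mount point is an assumption; adjust it to wherever your ephemeral disk is mounted):
; /etc/php-fpm.d/www.conf (pool config; the socket path is an assumption)
listen = /media/ephemeral0/php-fpm/php-fpm.sock
listen.owner = nginx
listen.group = nginx

# matching nginx location block
location ~ \.php$ {
    fastcgi_pass unix:/media/ephemeral0/php-fpm/php-fpm.sock;
    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
}
Keep in mind that instance-store contents are wiped when the instance stops, so the socket directory has to be recreated at boot before php-fpm starts.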

Related

Can you bind the default network interface of the host into the container to read network stats?

I have a project where I read system information from the host inside a container. Right now I got CPU, RAM and Storage to work, but Network turns out to be a little harder. I am using the Node.js library https://systeminformation.io/network.html, which reads the network stats from /sys/class/net/.
The only solution that I found so far is to use --network host, but that does not seem like the best way, because it breaks a lot of other networking-related stuff and I cannot assume that everybody who uses my project is fine with that.
I have tried --add-host=host.docker.internal:host-gateway as well, but while it does show up in /etc/hosts, it does not add a network interface to /sys/class/net/.
My knowledge on Docker and Linux is very limited, so does someone know if there is any other way?
My workaround for now is to use readlink -f /sys/class/net/$(ip addr show | awk '/inet.*brd/{print $NF; exit}') to get the final path to the network statistics of the default interface and mount it to an imaginary path in the container. Therefore I don't use the mentioned systeminformation library for that right now. I would still like to have something that is a bit more reliable and, ideally, officially supported by Docker. I am fine with something that is not compatible with systeminformation, though.
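(For reference, that workaround wired together might look like the following sketch; the image name and the /host_net_stats container path are placeholders:)
DEFAULT_IF=$(ip addr show | awk '/inet.*brd/{print $NF; exit}')
STATS_DIR=$(readlink -f /sys/class/net/"$DEFAULT_IF")
docker run --rm -v "$STATS_DIR:/host_net_stats:ro" my-monitoring-image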
There is a way to enter the host network namespace after starting the container. This can be used to run one process in the container's network namespace and another process in the host's network namespace; communication between the processes can be done using a unix domain socket.
Alternatively, you can just mount a new instance of sysfs that points to the host network namespace. If I understood correctly, this is what you really need.
For this to work you need access to the host net namespace (I mount /proc/1/ns/net into the container for this purpose). Additionally, the capabilities CAP_SYS_PTRACE and CAP_SYS_ADMIN are needed.
# /proc/1 is the 'init' process of the host which is always running in host network namespace
$ docker run -it --rm --cap-add CAP_SYS_PTRACE --cap-add CAP_SYS_ADMIN -v /proc/1/ns/net:/host_ns_net:ro debian:bullseye-slim bash
root@8b40f2f48808:/# ls -l /sys/class/net
lrwxrwxrwx 1 root root 0 Jun 2 21:09 eth0 -> ../../devices/virtual/net/eth0
lrwxrwxrwx 1 root root 0 Jun 2 21:09 lo -> ../../devices/virtual/net/lo
# enter the host network namespace
root@8b40f2f48808:/# nsenter --net=/host_ns_net bash
# now we are in the host network namespace and can see the host network interfaces
root@8b40f2f48808:/# mkdir /sys2
root@8b40f2f48808:/# mount -t sysfs nodevice /sys2
root@8b40f2f48808:/# ls -l /sys2/class/net/
lrwxrwxrwx 1 root root 0 Oct 25 2021 enp2s0 -> ../../devices/pci0000:00/0000:00:1c.1/0000:02:00.0/net/enp2s0
lrwxrwxrwx 1 root root 0 Oct 25 2021 enp3s0 -> ../../devices/pci0000:00/0000:00:1c.2/0000:03:00.0/net/enp3s0
[...]
root@8b40f2f48808:/# ls -l /sys2/class/net/enp2s0/
-r--r--r-- 1 root root 4096 Oct 25 2021 addr_assign_type
-r--r--r-- 1 root root 4096 Oct 25 2021 addr_len
-r--r--r-- 1 root root 4096 Oct 25 2021 address
-r--r--r-- 1 root root 4096 Oct 25 2021 broadcast
[...]
# Now you can switch back to the original network namespace
# of the container; the dir "/sys2" is still accessible
root@8b40f2f48808:/# exit
Putting this together for non-interactive usage:
Start the container with docker run and the following parameters:
docker run -it --rm --cap-add CAP_SYS_PTRACE --cap-add CAP_SYS_ADMIN -v /proc/1/ns/net:/host_ns_net:ro debian:bullseye-slim bash
Execute these commands in the container before starting your node app:
mkdir /sys2
nsenter --net=/host_ns_net mount -t sysfs nodevice /sys2
After nsenter (and mount) exits, you are back in the network namespace of the container. In theory you could drop the extended capabilities now.
Now you can access the network devices under /sys2/class/net.
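As a sketch, the two setup commands could be wrapped in an entrypoint script that runs them and then starts the app (the script name and the node command are assumptions):
#!/bin/sh
# entrypoint.sh (hypothetical): expose the host's network view under /sys2, then start the app
mkdir -p /sys2
nsenter --net=/host_ns_net mount -t sysfs nodevice /sys2
exec node server.js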
You could mount the host's /sys/class/net/ directory as a volume in your container and patch the systeminformation package to read from your custom path instead of the default one. The changes would need to be made in lib/network.js; you can see in that file how the directory is hardcoded throughout, so a find/replace in your local copy is enough to change all instances of the default path.
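A rough sketch of that approach (the /host_sys_class_net container path and the image name are placeholders, and the sed edit assumes you patch a local copy of the package before building the image):
# patch the hardcoded path in your local copy of the package
sed -i 's|/sys/class/net/|/host_sys_class_net/|g' node_modules/systeminformation/lib/network.js
# then mount the host directory at that path when running the container
docker run --rm -v /sys/class/net:/host_sys_class_net:ro my-monitoring-image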
An easy way is to mount the whole "/sys" filesystem of the host into the container. Either mount it to a new location (e.g. /sys_host) or over-mount the original "/sys" in the container:
# docker run -it --rm -v /sys:/sys:ro debian:bullseye-slim bash
root@b84df3184dce:/# ls -l /sys/class/net/
lrwxrwxrwx 1 root root 0 Oct 25 2021 enp2s0 -> ../../devices/pci0000:00/0000:00:1c.1/0000:02:00.0/net/enp2s0
lrwxrwxrwx 1 root root 0 Oct 25 2021 enp3s0 -> ../../devices/pci0000:00/0000:00:1c.2/0000:03:00.0/net/enp3s0
[...]
root@b84df3184dce:/# ls -l /sys/class/net/enp2s0/
-r--r--r-- 1 root root 4096 Oct 25 2021 addr_assign_type
-r--r--r-- 1 root root 4096 Oct 25 2021 addr_len
-r--r--r-- 1 root root 4096 Oct 25 2021 address
-r--r--r-- 1 root root 4096 Oct 25 2021 broadcast
[...]
Please be aware that this way the container has access to the whole "/sys" filesystem of the host. The relative links from the network interfaces to the PCI devices still work.
If you don't need to write, mount it read-only by appending ":ro" to the mounted path.

/var/log/daemon.log is taking too much space, how do I reduce it?

Below are the files:
-rw-r----- 1 root adm 4.4G Mar 6 09:04 daemon.log
-rw-r----- 1 root adm 6.2G Mar 1 06:26 daemon.log.1
-rw-r----- 1 root adm 50M Feb 23 06:26 daemon.log.2.gz
-rw-r----- 1 root adm 41M Feb 17 06:25 daemon.log.3.gz
-rw-r----- 1 root adm 72K Feb 9 06:25 daemon.log.4.gz
How can I remove them? Will it affect anything if I delete them directly?
Thanks in advance.
The best way to manage the logs would be to use logrotate.
This is Serhii's comment on your other similar question:
Have a look at this Logrotate tutorial:
linode.com/docs/uptime/logs/use-logrotate-to-manage-log-files. You can
use size to force log rotation when it grows bigger than the
specified [value]; you can also use rotate to control how many
times a log is rotated before old logs are removed (if you set it to
0, logs will be removed immediately after they are rotated).
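For example, a minimal logrotate stanza using those directives might look like this (the size and count values are placeholders; it would go in a file under /etc/logrotate.d/):
/var/log/daemon.log {
    size 500M       # rotate once the file grows past 500M
    rotate 4        # keep at most 4 rotated logs, then delete the oldest
    compress
    missingok
    notifempty
    copytruncate    # truncate in place so the daemon writing the log keeps working
}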
You can delete the logs, but it depends on the software you're running: if any of it needs part of those logs or utilises them in some way, deleting them may stop it working as intended.
You can also have a look at the logs and analyse which software writes the most data, then try to reconfigure it so the amount of log data generated drops significantly. That, combined with logrotate, should yield satisfactory results.
And if that's not enough, you can store your logs in a bucket and mount it as a disk in your VM's filesystem. That way any software installed on your VM will be able to write to it.
But this will incur some charges for the bucket storage, so keep that in mind.
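As a sketch, mounting a bucket with a FUSE tool could look like this (the bucket name and mountpoint are placeholders; s3fs is just one example, other providers have equivalent tools, and credentials still have to be configured separately):
s3fs my-log-bucket /mnt/log-archive -o allow_other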

Can't expose a FUSE-based volume to a Docker container

I'm trying to provide my docker container with an encrypted filesystem volume for internal use.
The idea is that the container writes to the volume as usual, but in fact the host encrypts the data before writing it to the filesystem.
I'm trying to use EncFS - it works well on the host, e.g:
encfs /encrypted /visible
I can write files to /visible, and those get encrypted.
However, when trying to run a container with /visible as the volume, e.g.:
docker run -i -t --privileged -v /visible:/myvolume imagename bash
I do get a volume in the container, but it's on the original /encrypted folder, not going through the EncFS. If I unmount the EncFS from /visible, I can see the files written by the container. Needless to say /encrypted is empty.
Is there a way to have docker mount the volume through EncFS, and not write directly to the folder?
In contrast, docker works fine when I use an NFS mount as a volume. It writes to the network device, and not to the local folder on which I mounted the device.
Thanks
I am unable to duplicate your problem locally. If I try to expose an encfs filesystem as a Docker volume, I get an error trying to start the container:
FATA[0003] Error response from daemon: Cannot start container <cid>:
setup mount namespace stat /visible: permission denied
So it's possible you have something different going on. In any case, this is what solved my problem:
By default, FUSE only permits the user who mounted a filesystem to have access to that filesystem. When you are running a Docker container, that container is initially running as root.
You can use the allow_root or allow_other mount options when you mount the FUSE filesystem. For example:
$ encfs -o allow_root /encrypted /other
Here, allow_root will permit the root user to have access to the mountpoint, while allow_other will permit anyone to have access to the mountpoint (provided that the Unix permissions on the directory allow them access).
If I mount my encfs filesystem using allow_root, I can then expose that filesystem as a Docker volume and the contents of that filesystem are correctly visible from inside the container.
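Putting that together with the paths from the question (a sketch; the image name is a placeholder):
# remount with allow_root so root inside the container can traverse the FUSE mount
fusermount -u /visible
encfs -o allow_root /encrypted /visible
docker run -i -t -v /visible:/myvolume imagename bash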
This is definitely because you started the docker daemon before the host mounted the mountpoint. In this case the inode for the directory name is still pointing at the host's local disk:
ls -i /mounts/
1048579 s3-data-mnt
then if you mount using a fuse daemon like s3fs:
/usr/local/bin/s3fs -o rw -o allow_other -o iam_role=ecsInstanceRole /mounts/s3-data-mnt
ls -i
1 s3-data-mnt
My guess is that docker does some bootstrap caching of the directory names to inodes (someone who has more knowledge of this than me can fill in this blank).
Your comment is correct. If you simply restart docker after the mounting has finished, your volume will be correctly shared from the host to your containers. (Or you can simply delay starting docker until after all your mounts have finished mounting.)
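One way to make that ordering automatic, assuming systemd manages docker and the s3fs mount is declared in /etc/fstab (type fuse.s3fs) so systemd has a .mount unit for it, is a hypothetical drop-in like this:
# /etc/systemd/system/docker.service.d/wait-for-mounts.conf
[Unit]
# don't start dockerd until the s3fs mountpoint is mounted
RequiresMountsFor=/mounts/s3-data-mnt
Then run systemctl daemon-reload and restart docker once.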
What is interesting (but makes complete sense to me now) is that upon exiting the container and unmounting the mountpoint on the host, all of my writes from within the container to the shared volume magically appeared (they were being stored at the inode on the host machine's local disk):
[root@host s3-data-mnt]# echo foo > bar
[root@host s3-data-mnt]# ls /mounts/s3-data-mnt
total 6
1 drwxrwxrwx 1 root root 0 Jan 1 1970 .
4 dr-xr-xr-x 28 root root 4096 Sep 16 17:06 ..
1 -rw-r--r-- 1 root root 4 Sep 16 17:11 bar
[root@host s3-data-mnt]# docker run -ti -v /mounts/s3-data-mnt:/s3-data busybox /bin/bash
root@5592454f9f4d:/mounts/s3-data# ls -als
total 8
4 drwxr-xr-x 3 root root 4096 Sep 16 16:05 .
4 drwxr-xr-x 12 root root 4096 Sep 16 16:45 ..
root@5592454f9f4d:/s3-data# echo baz > beef
root@5592454f9f4d:/s3-data# ls -als
total 9
4 drwxr-xr-x 3 root root 4096 Sep 16 16:05 .
4 drwxr-xr-x 12 root root 4096 Sep 16 16:45 ..
1 -rw-r--r-- 1 root root 4 Sep 16 17:11 beef
root@5592454f9f4d:/s3-data# exit
exit
[root@host s3-data-mnt]# ls /mounts/s3-data-mnt
total 6
1 drwxrwxrwx 1 root root 0 Jan 1 1970 .
4 dr-xr-xr-x 28 root root 4096 Sep 16 17:06 ..
1 -rw-r--r-- 1 root root 4 Sep 16 17:11 bar
[root@host /]# umount -l s3-data-mnt
[root@host /]# ls -als
[root@ip-10-0-3-233 /]# ls -als /s3-stn-jira-data-mnt/
total 8
4 drwxr-xr-x 2 root root 4096 Sep 16 17:28 .
4 dr-xr-xr-x 28 root root 4096 Sep 16 17:06 ..
1 -rw-r--r-- 1 root root 4 Sep 16 17:11 bar
You might be able to work around this by wrapping the mount call in nsenter to perform it in the same Linux mount namespace as the docker daemon, e.g.:
nsenter -t "$PID_OF_DOCKER_DAEMON" encfs ...
The question is whether this approach will survive a daemon restart itself. ;-)
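A hedged sketch of what that could look like, assuming the daemon process is named dockerd (-m enters its mount namespace so the FUSE mount becomes visible to it):
nsenter -t "$(pidof dockerd)" -m encfs /encrypted /visible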

Postgresql 'main/pg_notify/0000': Stale NFS file handle

I have a Debian Wheezy computer running a Postgresql Server and NO NFS filesystems.
After rebooting the computer, the following error has appeared:
ls: cannot access 0000: Stale NFS file handle
516439 drwx------ 2 postgres postgres 8 Nov 12 20:25 .
516480 drwx------ 3 postgres postgres 4096 Nov 17 17:08 ..
? ?????????? ? ? ? ? ? 0000
The "/var/lib/postgresql/9.1/main/pg_notify/0000" file is STALE and I cannot remove it or do anything at all with it. In order to get rid of that file, I tried the following options:
Rebooting the computer in order to unmount the filesystem (as suggested in several forums) did not work.
Removing postgresql (apt-get -purge) did not do anything at all either.
Trying to manually remove that file does not work either (Stale NFS file handle).
This directory is part of a JFS partition over a ciphered volume managed by LVM.
The output for the fsck:
fsck.jfs version 1.1.15, 04-Mar-2011
processing started: 11/17/2014 20:22:30
Using default parameter: -p
The current device is: /
ujfs_rw_diskblocks: read 0 of 4096 bytes at offset 32768
ujfs_rw_diskblocks: read 0 of 4096 bytes at offset 61440
Superblock is corrupt and cannot be repaired
since both primary and secondary copies are corrupt.
Output for ls -l:
ls -l /var/lib/postgresql/9.1/main/pg_notify/0000
I would like to know...
Why do I have a problem with an NFS handle on a non-NFS partition?
Is there any way I can get rid of that file (workarounds are more than welcome as well)?

Database backups not writing to disc, not enough space?

I just inherited an AIX project which I know very little about. I have a cronjob that does a full backup of my database (DB2) and that has been failing for a few days now. Looking at the logs, I'm seeing this:
SQL2419N The target disk "/home/dbtmp/backups" has become full.
When checking out this directory:
(/var/spool/cron)> df -g /home/dbtmp
Filesystem GB blocks Free %Used Iused %Iused Mounted on
/dev/dbtmplv 10.00 0.96 91% 85 1% /home/dbtmp
The size of the previous backups:
(/var/spool/cron)> ll /home/dbtmp/backups
total 18365248
-rw------- 1 hsprd cics 4411498496 Feb 12 18:01 HSPRD.0.hsprd.NODE0000.CATN0000.20130212180036.001
-rw------- 1 hstrn cics 874287104 Feb 12 18:08 HSTRN.0.hstrn.NODE0000.CATN0000.20130212180747.001
-rw------- 1 hstst cics 3242835968 Feb 12 18:05 HSTST.0.hstst.NODE0000.CATN0000.20130212180443.001
What options do I have to fix this? Thank you.
As you can see, the size of your backup files exceeds the free space on the device. You need a larger device.
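If the volume group behind /home/dbtmp has free space, one option on AIX is to grow the filesystem in place; this is only a sketch, and the +10G figure is just an example:
# see which volume group holds dbtmplv and whether it has free physical partitions
lslv dbtmplv
lsvg rootvg        # replace rootvg with the volume group reported above
# grow /home/dbtmp by 10 GB (requires free space in that volume group)
chfs -a size=+10G /home/dbtmp
Otherwise, move or prune the older backup images before the next run, or point the backup at a larger filesystem.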
