Empty LXC core dump - Linux

Good day.
I cannot get a core dump file from any process launched inside an LXC container.
Here are my settings (inside the container):
$ cat /proc/sys/kernel/core_pattern
/var/crash/coredump-%e.%p
$ ulimit -c
unlimited
$ ls -lha /var/crash/
drwxrwxrwx 2 root root 13 Oct 14 23:34 .
...
I run a C program that crashes inside the container:
$ ./c
Segmentation fault
And I don't get a "core dumped" message (unlike on the base system, where outside the container everything works perfectly).
But a new file is generated at the core dump path, and it has zero size:
$ ls -lha /var/crash/*
-rw------- 1 root root 0 Oct 14 23:43 /var/crash/coredump-c.1866
The behavior is the same for the php5-fpm process: it also generates a zero-sized core dump when it crashes, even though the core size limit for the pool is set to "unlimited" in the pool settings.
Is there anything I've missed? Searching the Internet for things like "limits lxc" or "core dump lxc" didn't turn up anything relevant. Thanks!
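For reference, the source of the crashing test program isn't shown above; an assumed equivalent of ./c, i.e. a minimal program that forces a segmentation fault, could look like this:

/* crash.c - minimal test program that triggers SIGSEGV (assumed equivalent of ./c) */
#include <stdio.h>

int main(void)
{
    int *p = NULL;             /* null pointer ... */
    *p = 42;                   /* ... dereferenced: raises SIGSEGV, which should dump core */
    printf("never reached: %d\n", *p);
    return 0;
}

Compiled with gcc -o c crash.c and run on the host, this prints "Segmentation fault (core dumped)" and leaves a non-empty core file; inside the container only the empty file described above appears.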

Related

Linux Named Pipe Mounted on Docker Volume Showing as Regular File

I am trying to use a named pipe to run certain commands from a dockerised guest application to the host.
I am aware of the risks and this is not public facing, so please no comments about not doing this.
I have a named pipe configured on the host using:
sudo mkfifo -m a+rw /path/to/pipe/file
When I check the created pipe's permissions with ls -la file, it shows that the pipe has been created and the intended permissions are set.
prw-rw-rw- 1 root root 0 Feb 2 11:43 file
When I then test the input by echoing a command into the pipe from the host, it runs successfully.
Input
echo "echo test" > file
Output
[!] Starting listening on named pipe: file
test
The problem appears to be within my Docker container. I have created a volume and mounted the named pipe from the host. When I then start an sh session and run ls -l, however, the named pipe appears as a normal file, without the p flag and the permissions present on the host.
/hostpipe # ls -la
total 12
drwxr-xr-x 2 root root 4096 Feb 1 16:25 .
drwxr-xr-x 1 root root 4096 Feb 2 11:44 ..
-rw-r--r-- 1 root root 11 Feb 2 11:44 file
Running the same echo "echo test" > file (and similar commands) from within the guest does not work.
The host is a Linux desktop running on bare metal.
Linux desktop 5.15.0-58-generic #64-Ubuntu SMP Thu Jan 5 11:43:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
And the guest is an Alpine image
FROM python:3.8-alpine
and
Linux b16a4357fcf5 5.15.0-58-generic #64-Ubuntu SMP Thu Jan 5 11:43:13 UTC 2023 x86_64 Linux
Any idea what is going wrong here?
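One quick way to confirm what the guest actually sees (a small check of my own, not from the original post; /hostpipe/file is an example path) is to stat() the path and test S_ISFIFO:

/* isfifo.c - report whether a path is a named pipe (illustrative sketch) */
#include <stdio.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : "/hostpipe/file";  /* example default path */
    struct stat st;
    if (stat(path, &st) != 0) {
        perror("stat");
        return 1;
    }
    printf("%s is %s\n", path, S_ISFIFO(st.st_mode) ? "a FIFO" : "NOT a FIFO");
    return 0;
}

Run inside the container, this reports "NOT a FIFO" for the file created by the plain volume, matching the ls -la output above.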
The issue was how the container was being set up. I was using a regular volume, which is meant for persisting data, not for mounting host drives and files. I had to change my definition to use a bind mount (type: bind).
Using volumes without the bind type does not allow use of host file system functionality; it only allows data sharing.
Before
volumes:
  - static_data:/vol/static
  - ./web:/web
  - /opt/named_pipes/:/hostpipe
After
volumes:
  - static_data:/vol/static
  - ./web:/web
  - type: bind
    source: /opt/named_pipes/
    target: /hostpipe

Core dump size is 0 for non-root user

For some reason, the core dump is generated with a size of 0 bytes in a Kubernetes pod running as a non-root user.
If I modify the pod to run as root, this works fine.
Is there a way to get this to work for a non-root user?
These are my settings on the node; they propagate to the pods since /proc is mounted as a hostPath:
[root@ip-xx-xx-xx-xx /]# cat /proc/sys/kernel/core_pattern
/tmp/%h.%t.core
[root@ip-xx-xx-xx-xx /]# cat /proc/sys/fs/suid_dumpable
2
As a non-root user, I can create and modify files under the /tmp directory.
But when I try to generate a core dump with "kill -6 <pid>", the core dump is generated with size 0:
-rw-------. 1 root root 0 Aug 11 17:26 app1-cluster1.core.8
If I modify my k8s deployment to run as the root user (i.e. runAsUser: 0), I can generate a core dump successfully:
-rw-------. 1 root root 219246592 Aug 11 18:12 app1-cluster1.core.8

Two identical NFS shares, but only one of the two gives Stale file handle errors

I have a Linux (Raspbian) server:
$ uname -a
Linux hester 4.19.97-v7l+ #1294 SMP Thu Jan 30 13:21:14 GMT 2020 armv7l GNU/Linux
With two directories that have the same user/group/permissions:
$ ls -ld /mnt/storage/gitea/ /mnt/storage/hester/
drwxr-xr-x 2 nobody nogroup 26 Mar 2 10:20 /mnt/storage/gitea/
drwxr-xr-x 3 nobody nogroup 21 Feb 21 11:26 /mnt/storage/hester/
These two directories are exported with the same parameters in the exports file:
$ cat /etc/exports
/mnt/storage/hester 192.168.1.15(rw,sync,no_subtree_check)
/mnt/storage/gitea 192.168.1.15(rw,sync,no_subtree_check)
On another machine (the 192.168.1.15 mentioned in the exports file) I mount both successfully:
$ mount /mnt/storage/gitea/
$ echo $?
0
$ mount /mnt/storage/hester/
$ echo $?
0
But now weird things happen:
$ ls -l /mnt/storage/
ls: cannot access '/mnt/storage/gitea': Stale file handle
total 0
d????????? ? ? ? ? ? gitea
drwxr-xr-x 3 nobody nogroup 21 Feb 21 11:26 hester
I really can't figure out what the source of the error is, and above all, where I could look for a difference between the two.
I'm open to suggestions for further investigation or answers to my doubts. Thanks in advance for any useful input!
I finally found the solution, which was to explicitly add an fsid option in exports:
$ cat /etc/exports
/mnt/storage/hester 192.168.1.15(rw,sync,fsid=20,no_subtree_check)
/mnt/storage/gitea 192.168.1.15(rw,sync,fsid=21,no_subtree_check)
I'm not entirely sure as to the reason why this works. From the man page I get that "NFS needs to be able to identify each filesystem that it exports. Normally it will use a UUID for the filesystem (if the filesystem has such a thing) or the device number of the device holding the filesystem (if the filesystem is stored on the device)."
Both these mount points are on the same filesystem, so according to the man page they would get the same identifier; but that causes the same directory to be exported for both, so I take it each export needs its own distinct fsid.
One more note: /mnt/storage is an XFS filesystem over a RAID3, so this could also have confused NFS about the UUIDs of the devices.

Programmatically create a btrfs file system whose root directory has a specific owner

Background
I have a test script that creates and destroys file systems on the fly, used in a suite of performance tests.
To avoid running the script as root, I have a disk device /dev/testdisk that is owned by a specific user testuser, along with a suitable entry in /etc/fstab:
$ ls -l /dev/testdisk
crw-rw---- 1 testuser testuser 21, 1 Jun 25 12:34 /dev/testdisk
$ grep testdisk /etc/fstab
/dev/testdisk /mnt/testdisk auto noauto,user,rw 0 0
This allows the disk to be mounted and unmounted by a normal user.
Question
I'd like my script (which runs as testuser) to programmatically create a btrfs file system on /dev/testdisk such that the root directory is owned by testuser:
$ mount /dev/testdisk /mnt/testdisk
$ ls -la /mnt/testdisk
total 24
drwxr-xr-x 3 testuser testuser 4096 Jun 25 15:15 .
drwxr-xr-x 3 root root 4096 Jun 23 17:41 ..
drwx------ 2 root root 16384 Jun 25 15:15 lost+found
Can this be done without running the script as root, and without resorting to privilege escalation (use of sudo) within the script?
Comparison to other file systems
With ext{2,3,4} it's possible to create a filesystem whose root directory is owned by the current user, with the following command:
mkfs.ext{2,3,4} -F -E root_owner /dev/testdisk
Workarounds I'd like to avoid (if possible)
I'm aware that I can use the btrfs-convert tool to convert an existing (possibly empty) ext{2,3,4} file system to btrfs format. I could use this workaround in my script (by first creating an ext4 filesystem and then immediately converting it to btrfs), but I'd rather avoid it if there's a way to create the btrfs file system directly.

core dump files on Linux: how to get info on opened files?

I have a core dump file from a process that probably has a file descriptor leak (it opens files and sockets but apparently sometimes forgets to close some of them). Is there a way to find out which files and sockets the process had open before crashing? I can't easily reproduce the crash, so analyzing the core file seems to be the only way to get a hint on the bug.
If you have a core file and you have compiled the program with debugging options (-g), you can see where the core was dumped:
$ gcc -g -o something something.c
$ ./something
Segmentation fault (core dumped)
$ gdb something core
You can use this to do some post-mortem debugging. A few gdb commands: bt prints the stack, fr jumps to a given stack frame (see the output of bt).
Now if you want to see which files are open at the time of the segmentation fault, just handle the SIGSEGV signal, and in the handler dump the contents of the /proc/PID/fd directory (e.g. with system("ls -l /proc/PID/fd") or execv).
With this information at hand you can easily find what caused the crash, which files were open, and whether the crash and the file descriptor leak are connected.
Your best bet is to install a signal handler for whatever signal is crashing your program (SIGSEGV, etc.).
Then, in the signal handler, inspect /proc/self/fd, and save the contents to a file. Here is a sample of what you might see:
Anderson cxc # ls -l /proc/8247/fd
total 0
lrwx------ 1 root root 64 Sep 12 06:05 0 -> /dev/pts/0
lrwx------ 1 root root 64 Sep 12 06:05 1 -> /dev/pts/0
lrwx------ 1 root root 64 Sep 12 06:05 10 -> anon_inode:[eventpoll]
lrwx------ 1 root root 64 Sep 12 06:05 11 -> socket:[124061]
lrwx------ 1 root root 64 Sep 12 06:05 12 -> socket:[124063]
lrwx------ 1 root root 64 Sep 12 06:05 13 -> socket:[124064]
lrwx------ 1 root root 64 Sep 12 06:05 14 -> /dev/driver0
lr-x------ 1 root root 64 Sep 12 06:05 16 -> /temp/app/whatever.tar.gz
lr-x------ 1 root root 64 Sep 12 06:05 17 -> /dev/urandom
Then you can return from your signal handler, and you should get a core dump as usual.
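A minimal sketch of such a handler (the output file name and the exact command are my own choices, not taken from the answers) could look like this:

/* fd_dump_on_crash.c - save the open-fd listing on SIGSEGV/SIGABRT, then re-raise
 * the signal so the normal core dump is still produced. Illustrative sketch only. */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void dump_fds_and_die(int sig)
{
    char cmd[128];
    /* /proc/self inside the spawned shell would refer to ls itself, so use our own PID. */
    snprintf(cmd, sizeof cmd, "ls -l /proc/%d/fd > /tmp/open-fds-at-crash.txt", (int)getpid());
    /* system() is not async-signal-safe, but for a one-shot crash handler it is a
     * pragmatic choice that usually works. */
    system(cmd);
    signal(sig, SIG_DFL);      /* restore the default action ... */
    raise(sig);                /* ... and re-raise so the core dump is still written */
}

int main(void)
{
    signal(SIGSEGV, dump_fds_and_die);
    signal(SIGABRT, dump_fds_and_die);

    int *p = NULL;
    *p = 1;                    /* force a crash to exercise the handler */
    return 0;
}

After the crash you get both the usual core file for gdb and /tmp/open-fds-at-crash.txt containing a listing like the one above.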
One of the ways I get at this information is just running strings on the core file. For instance, when I was running file on a core recently, due to the length of the folder names I would get a truncated argument list. I knew my run would have opened files from my home directory, so I just ran:
strings core.14930|grep jodie
But this is a case where I had a needle and a haystack.
If the program forgot to close those resources, it might be because something like the following happened:
fd = open("/tmp/foo", O_CREAT | O_WRONLY, 0644);
// do stuff
fd = open("/tmp/bar", O_CREAT | O_WRONLY, 0644); // Oops, forgot to close(fd)
Now the file descriptor for foo is no longer stored anywhere in memory.
If this didn't happen, you might be able to find the file descriptor number, but then again, that is not very useful, because descriptors are reused continuously; by the time you get to debug, you won't know which file a given number actually referred to at the time.
I really think you should debug this live, with strace, lsof and friends.
If there is a way to do it from the core dump, I'm eager to know it too :-)
You can try using strace to see the open, socket and close calls the program makes.
Edit: I don't think you can get the information from the core; at most it will have the file descriptors somewhere, but this still doesn't give you the actual file/socket. (Assuming you can distinguish open from closed file descriptors, which I also doubt.)
Recently, during error troubleshooting and analysis, a customer provided me with a core dump that had been generated on his filesystem before he went out of station. To quickly scan through the file and read its contents, I used the command
strings core.67545 > coredump.txt
and was later able to open the resulting file in a text editor.
A core dump is a copy of the memory the process had access to when it crashed. Depending on how the leak occurs, the references to the handles may already be lost, so the dump may prove useless.
lsof lists all files currently open in the system; you could check its output to find leaked sockets or files. Yes, you'd need to have the process running. You could run it under a specific username to more easily tell which open files belong to the process you are debugging.
I hope somebody else has better information :-)
Another way to find out what files a process has open - again, only at runtime - is to look into /proc/PID/fd/, which contains symlinks to the open files.
