Reading /dev/cpu/*/msr from userspace: operation not permitted

I am trying to write a simple application that reads MSRs (model-specific registers), and I am running it from userspace as a regular user.
I have loaded the msr module and given everyone read permission on /dev/cpu/*/msr, but a regular user still cannot access these files, while root can.
The permissions look like this:
crw-r--r-- 1 root root 202, 0 sep 6 17:55 /dev/cpu/0/msr
crw-r--r-- 1 root root 202, 1 sep 6 17:55 /dev/cpu/1/msr
crw-r--r-- 1 root root 202, 2 sep 6 17:55 /dev/cpu/2/msr
crw-r--r-- 1 root root 202, 3 sep 6 17:55 /dev/cpu/3/msr
I keep getting an "Operation not permitted" error when I try to read these files as a regular user, but it works fine when root accesses them. What am I doing wrong? I am on Ubuntu 13.04 with kernel version 3.11.0.

Since around version 3.7, the mainline Linux kernel has required an executable to hold the CAP_SYS_RAWIO capability in order to open the MSR device file [2]. Besides loading the msr kernel module and setting appropriate permissions on the msr device files, you must grant CAP_SYS_RAWIO to any user executable that needs access to the MSR driver, using the command below:
sudo setcap cap_sys_rawio=ep <user_executable>
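As a rough illustration of what such a user executable does, here is a minimal Python sketch. It assumes the usual msr driver convention that the register number is used as the file offset and that each read returns 8 bytes. The function names read_msr and explain_open_error are made up for this example. Note that file capabilities apply to the executable itself, so for a script it is the interpreter binary that would need CAP_SYS_RAWIO, which is usually undesirable. A small compiled program is the more common choice.

```python
import errno
import os
import struct


def read_msr(path, register):
    """Read one 64-bit MSR; the register number is the file offset."""
    fd = os.open(path, os.O_RDONLY)
    try:
        raw = os.pread(fd, 8, register)
    finally:
        os.close(fd)
    if len(raw) != 8:
        raise OSError(errno.EIO, "short read from " + path)
    # x86 is little-endian, so decode as an unsigned 64-bit LE integer.
    return struct.unpack("<Q", raw)[0]


def explain_open_error(err):
    """Map the errno of a failed open() to the step that is likely missing."""
    if err == errno.EACCES:
        return "file permissions: check ls -l on the msr device"
    if err == errno.EPERM:
        return "capability: the executable lacks CAP_SYS_RAWIO"
    if err == errno.ENOENT:
        return "device missing: is the msr module loaded?"
    return os.strerror(err)


if __name__ == "__main__":
    try:
        # 0x34 is MSR_SMI_COUNT on Intel CPUs.
        print("MSR 0x34 =", read_msr("/dev/cpu/0/msr", 0x34))
    except OSError as exc:
        print("failed:", explain_open_error(exc.errno))
```

The distinction matters for the question above: "Permission denied" (EACCES) points at the file mode, while "Operation not permitted" (EPERM) points at the missing capability.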

For me (on Debian), it helped to set the device permissions after loading the msr module. In addition to PaulUTK's answer, as root:
setcap cap_sys_rawio=ep <user_executable>
Then set the device permissions (check the current ones first):
ls -l /dev/cpu/*/msr
crw------- ... /dev/cpu/0/msr
I added a group named msr and assigned it to the device files. As root:
chgrp msr /dev/cpu/*/msr
chmod g+rw /dev/cpu/*/msr
ls -l /dev/cpu/*/msr
crw-rw---- ... /dev/cpu/0/msr
Add the user to the group:
usermod -aG msr hardworkinguser
Bonus hint:
To pick up the new group as hardworkinguser without logging in again:
newgrp msr
I also heard secure boot must be disabled.

Responding to the following in Benjamin Peter's answer:
I also heard secure boot must be disabled.
With AlmaLinux 8.7 and a 4.18.0-425.3.1.el8.x86_64 kernel, I was able to read an MSR while secure boot was enabled.
read_smi_count.c is the code for the program I tested. I was able to run it and successfully read the MSR_SMI_COUNT (0x34) register. The following is the output after building the program, which prompts for what needs to be done to give the user program access to read the MSR:
[mr_halfword@skylake-alma release]$ read_smi_count/read_smi_count
Error: No permission to open /dev/cpu/0/msr. Try:
sudo chmod o+r /dev/cpu/0/msr
[mr_halfword@skylake-alma release]$ sudo chmod o+r /dev/cpu/0/msr
[sudo] password for mr_halfword:
[mr_halfword@skylake-alma release]$ read_smi_count/read_smi_count
Error: No permission to open /dev/cpu/0/msr. Try:
sudo setcap cap_sys_rawio=ep read_smi_count/read_smi_count
[mr_halfword@skylake-alma release]$ sudo setcap cap_sys_rawio=ep read_smi_count/read_smi_count
[mr_halfword@skylake-alma release]$ read_smi_count/read_smi_count
SMI COUNT = 15240
The output of dmesg confirms that the kernel is locked down as a result of EFI secure boot being enabled:
[mr_halfword@skylake-alma release]$ dmesg|grep lockdown
[ 0.000000] Kernel is locked down from EFI secure boot; see man kernel_lockdown.7
[ 1.578247] Lockdown: swapper/0: Hibernation is restricted; see man kernel_lockdown.7
[ 37.750117] Lockdown: x86_energy_perf: Direct MSR access is restricted; see man kernel_lockdown.7
The lockdown mode is integrity:
[mr_halfword@skylake-alma release]$ cat /sys/kernel/security/lockdown
none [integrity]
The above output doesn't offer confidentiality as a lockdown mode. I haven't investigated whether confidentiality mode would prevent reading MSRs.
The article "Linux kernel lockdown, integrity, and confidentiality" notes that confidentiality mode applies additional restrictions to prevent secrets from being read out of the kernel.
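To check the lockdown state from a program rather than by eye, a small sketch like the following could parse that file. It assumes only the format shown above, where the active mode is the entry in square brackets. The function name active_lockdown_mode is made up for this example.

```python
def active_lockdown_mode(text):
    """Return the bracketed entry, e.g. 'integrity' from 'none [integrity]'."""
    for word in text.split():
        if word.startswith("[") and word.endswith("]"):
            return word[1:-1]
    return None


if __name__ == "__main__":
    try:
        with open("/sys/kernel/security/lockdown") as f:
            print("lockdown mode:", active_lockdown_mode(f.read()))
    except OSError:
        # The file is absent on kernels built without the lockdown LSM.
        print("lockdown file not available")
```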

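Have a look at vfs_read in the kernel (abridged):
ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos)
{
    ssize_t ret;

    /* Check that the requested range may be accessed. */
    ret = rw_verify_area(READ, file, pos, count);
    if (ret >= 0) {
        ...
        if (file->f_op->read) /* your driver's read handler */
            ret = file->f_op->read(file, buf, count, pos);
        else
            ret = do_sync_read(file, buf, count, pos);
        ....
    }
    /*
     * A negative ret becomes the errno seen in userspace:
     * -EACCES (-13) is "Permission denied",
     * -EPERM (-1) is "Operation not permitted".
     */
    return ret;
}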
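For reference, the mapping from a negative kernel return value to the message a user sees can be reproduced with a few lines of Python. This is only an illustration. The helper name describe_kernel_return is made up, and the exact message text comes from the C library.

```python
import errno
import os


def describe_kernel_return(ret):
    """Translate a kernel-style return value into 'NAME: message'."""
    if ret >= 0:
        return "success (%d bytes)" % ret
    code = -ret  # the kernel returns -errno on failure
    name = errno.errorcode.get(code, "UNKNOWN")
    return "%s: %s" % (name, os.strerror(code))


if __name__ == "__main__":
    for value in (-1, -13, 8):
        print(value, "->", describe_kernel_return(value))
```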

Related

How to troubleshoot an expected CDROM device on custom Linux kernel?

I'm looking for some hints on troubleshooting a missing CD-ROM device.
I suspect the problem is a missing configuration option in my custom kernel (linux-5.4.78).
My current .config has:
CONFIG_CDROM=y
CONFIG_BLK_DEV_SR=y
CONFIG_VHOST_SCSI=y
CONFIG_BLK_SCSI_REQUEST=y
CONFIG_SCSI_MOD=y
CONFIG_SCSI=y
CONFIG_SCSI_DMA=y
CONFIG_SCSI_NETLINK=y
CONFIG_SCSI_PROC_FS=y
CONFIG_SCSI_SPI_ATTRS=y
CONFIG_SCSI_FC_ATTRS=y
CONFIG_SCSI_ISCSI_ATTRS=y
CONFIG_SCSI_SAS_ATTRS=y
CONFIG_SCSI_SAS_LIBSAS=y
CONFIG_SCSI_SAS_HOST_SMP=y
CONFIG_SCSI_LOWLEVEL=y
CONFIG_ISCSI_TCP=y
CONFIG_ISCSI_BOOT_SYSFS=y
CONFIG_SCSI_CXGB3_ISCSI=y
CONFIG_SCSI_CXGB4_ISCSI=y
CONFIG_SCSI_BNX2_ISCSI=y
CONFIG_BE2ISCSI=y
CONFIG_SCSI_HPSA=y
CONFIG_SCSI_3W_9XXX=y
CONFIG_SCSI_3W_SAS=y
CONFIG_SCSI_ACARD=y
CONFIG_SCSI_AACRAID=y
CONFIG_SCSI_AIC7XXX=y
CONFIG_SCSI_AIC79XX=y
CONFIG_SCSI_AIC94XX=y
CONFIG_SCSI_HISI_SAS=y
CONFIG_SCSI_HISI_SAS_PCI=y
CONFIG_SCSI_MVSAS=y
CONFIG_SCSI_MVSAS_TASKLET=y
CONFIG_SCSI_MVUMI=y
CONFIG_SCSI_DPT_I2O=y
CONFIG_SCSI_ADVANSYS=y
CONFIG_SCSI_ARCMSR=y
CONFIG_SCSI_ESAS2R=y
CONFIG_SCSI_MPT3SAS=y
CONFIG_SCSI_MPT2SAS=y
CONFIG_SCSI_SMARTPQI=y
CONFIG_SCSI_UFSHCD=y
CONFIG_SCSI_UFSHCD_PCI=y
CONFIG_SCSI_UFSHCD_PLATFORM=y
CONFIG_SCSI_UFS_CDNS_PLATFORM=y
CONFIG_SCSI_UFS_HISI=y
CONFIG_SCSI_UFS_BSG=y
CONFIG_SCSI_HPTIOP=y
CONFIG_SCSI_BUSLOGIC=y
CONFIG_SCSI_FLASHPOINT=y
CONFIG_SCSI_MYRB=y
CONFIG_SCSI_MYRS=y
CONFIG_VMWARE_PVSCSI=y
CONFIG_SCSI_SNIC=y
CONFIG_SCSI_DMX3191D=y
CONFIG_SCSI_FDOMAIN=y
CONFIG_SCSI_FDOMAIN_PCI=y
CONFIG_SCSI_GDTH=y
CONFIG_SCSI_ISCI=y
CONFIG_SCSI_IPS=y
CONFIG_SCSI_INITIO=y
CONFIG_SCSI_INIA100=y
CONFIG_SCSI_PPA=y
CONFIG_SCSI_IMM=y
CONFIG_SCSI_STEX=y
CONFIG_SCSI_SYM53C8XX_2=y
CONFIG_SCSI_SYM53C8XX_MMIO=y
CONFIG_SCSI_IPR=y
CONFIG_SCSI_IPR_TRACE=y
CONFIG_SCSI_IPR_DUMP=y
CONFIG_SCSI_QLOGIC_1280=y
CONFIG_SCSI_QLA_FC=y
CONFIG_SCSI_QLA_ISCSI=y
CONFIG_SCSI_LPFC=y
CONFIG_SCSI_DC395x=y
CONFIG_SCSI_AM53C974=y
CONFIG_SCSI_WD719X=y
CONFIG_SCSI_PMCRAID=y
CONFIG_SCSI_PM8001=y
CONFIG_SCSI_BFA_FC=y
CONFIG_SCSI_VIRTIO=y
CONFIG_SCSI_CHELSIO_FCOE=y
CONFIG_SCSI_LOWLEVEL_PCMCIA=y
CONFIG_SCSI_DH=y
CONFIG_SCSI_DH_RDAC=y
CONFIG_SCSI_DH_HP_SW=y
CONFIG_SCSI_DH_EMC=y
CONFIG_SCSI_DH_ALUA=y
CONFIG_ISCSI_TARGET=y
CONFIG_ISCSI_TARGET_CXGB4=y
CONFIG_QED_ISCSI=y
I'm expecting to see /dev/sr0, but it's not there, and dmesg says nothing about sr0.
However, I can see it with the stock kernel, and I've determined that it is provided by BLK_DEV_SR on my target:
# ls -l /dev/sr0
brw-rw---- 1 root optical 11,0 Apr 21 15:02 /dev/sr0
# readlink /sys/dev/block/11\:0/device/driver
../../../../../../../../../../../../bus/scsi/driver/sr
I'd appreciate any help.
If your custom Linux has udev, try udevadm monitor.
When you eject or insert a CD, you should see a change event in the terminal with the device path.
Also, whatever the actual device path is, a CD-ROM drive is conventionally made available under /media/cdrom.

/dev/ttyACM0: permission denied on openSUSE

I am trying to use an "Arduboy," based on the Arduino Leonardo, with the Arduino IDE. I cannot upload the example code, however, because of the following error:
avrdude: ser_open(): can't open device "/dev/ttyACM0": Permission denied
Problem uploading to board. See http://www.arduino.cc/en/Guide/Troubleshooting#upload for suggestions.
Before you mark this as a duplicate, here are all of the things I have tried:
Adding myself to the dialout group, which owns /dev/ttyACM0
Running chmod a+rw /dev/ttyACM0 every time I plug in the board
Creating this udev rule: KERNEL=="ttyACM0", MODE="0666"
None of these worked. What did work was running the IDE through xdg-su, like so: xdg-su -c ./arduino. However, I don't think it's a good idea to run it as root every time. Is there anything I can do?
I am running openSUSE Tumbleweed.
Arduino Leonardo-based boards briefly drop the connection on ttyACM* during an upload (like logging out and back in), so the device node disappears and is recreated. For some reason, the permissions change during this process. See the output of a repeated ls -l --full-time /dev/ttyACM0 during a failed upload:
crw-rw-rw- 1 root dialout 166, 0 2019-08-11 17:28:31.974025089 +0200 /dev/ttyACM0
ls: cannot access '/dev/ttyACM0': No such file or directory
crw------- 1 root root 166, 0 2019-08-11 17:42:15.523439213 +0200 /dev/ttyACM0
crw-rw---- 1 root dialout 166, 0 2019-08-11 17:42:16.083442857 +0200 /dev/ttyACM0
I also use Tumbleweed. The only workaround that I currently know is to start the Arduino IDE as root.
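To capture that kind of permission flip without retyping ls, a small polling sketch such as the one below can help. It is only an illustration. The function names snapshot and watch are made up, and the polling interval is an arbitrary choice that may miss very short-lived states.

```python
import os
import stat
import time


def snapshot(path):
    """Return an ls-like summary of the node, or 'missing' if it is gone."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return "missing"
    return "%s uid=%d gid=%d" % (stat.filemode(st.st_mode), st.st_uid, st.st_gid)


def watch(path, samples, interval=0.05):
    """Yield a snapshot each time it differs from the previous one."""
    last = None
    for _ in range(samples):
        current = snapshot(path)
        if current != last:
            yield current
            last = current
        time.sleep(interval)
```

Running something like `for line in watch("/dev/ttyACM0", 2000): print(line)` in one terminal while starting an upload in another should show each transition once.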
Go to your Arduino program directory and open a terminal there. Then run ./arduino-linux-setup.sh $USER and reboot. After that, you should be able to upload code to your Arduino.

Access urandom device get "permission denied", why?

I can create a new urandom device node in some directory (test_urandom in the example below), and it works as expected, e.g.:
test_urandom$ sudo mknod -m 0444 ./urandom c 1 9
test_urandom$ ls -l
total 0
cr--r--r-- 1 root root 1, 9 Jun 9 09:06 urandom
test_urandom$ head -c 10 ./urandom
�׫O1�9�^
However, if I create the same device node in another directory, which in my case is on an ext4 filesystem on LVM (Logical Volume Management), reading it fails with "Permission denied":
test_urandom_lvm$ sudo mknod -m 0444 ./urandom c 1 9
test_urandom_lvm$ ls -l
total 0
cr--r--r-- 1 root root 1, 9 Jun 9 09:06 urandom
test_urandom_lvm$ head -c 10 ./urandom
head: cannot open ‘./urandom’ for reading: Permission denied
If I am allowed to create a device node on the filesystem, why am I not allowed to read from it? What causes the "Permission denied" error? What changes are needed to make it work?
The filesystem is mounted with the nodev option, which prevents block and character special device nodes on it from being used. Mounting it with the dev option will allow them to work.
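A quick way to confirm this is to look at the mount flags of the directory in question. The sketch below uses os.statvfs and the os.ST_NODEV flag from the Python standard library (Linux). The two function names are made up for this example.

```python
import os


def options_forbid_devices(options):
    """True if a comma-separated mount option string contains nodev."""
    return "nodev" in options.split(",")


def mount_forbids_devices(path):
    """True if the filesystem holding `path` is mounted with nodev."""
    return bool(os.statvfs(path).f_flag & os.ST_NODEV)


if __name__ == "__main__":
    for directory in ("/", "/tmp"):
        print(directory, "nodev:", mount_forbids_devices(directory))
```

The first helper works on the option column of /proc/mounts, while the second asks the kernel directly about the filesystem that contains the given path.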

Busybox SUID on NFS rootfs

I am building a Linux system from the ground up for a Beagle Bone board. I have compiled the vanilla kernel and built a basic root file system with busybox. The system is booted with U-boot, while the rootfs is located on a Linux PC and exported through NFS:
/path/to/rootfs 10.42.0.17(rw,wdelay,no_root_squash,no_subtree_check,sec=sys,rw,secure,no_root_squash,no_all_squash)
The U-boot bootargs are:
bootargs console=ttyO0,115200n8 root=/dev/nfs rw nfsroot=${serverip}:/path/to/rootfs,v3,tcp ip=dhcp
I've encountered a problem when trying to get su working for non-root users. To work around it, people on the internet suggest setting the suid bit on the busybox binary.
After doing so:
$ sudo chmod u+s busybox
and verifying:
$ ls -la
...
-rwsr-xr-x 1 myuser myuser 1882976 Jan 13 21:47 busybox
...
$ stat -c "%a %n" busybox
4755 busybox
Something went wrong. The kernel boots and all of the usual messages are displayed, but it gets stuck at the end, and no login prompt appears. Here are the last few lines of the boot sequence:
[ 3.776185] IP-Config: Complete:
[ 3.779656] device=eth0, hwaddr=c8:a0:30:c5:80:e9, ipaddr=10.42.0.17, mask=255.255.255.0, gw=10.42.0.1
[ 3.789877] host=10.42.0.17, domain=, nis-domain=(none)
[ 3.795822] bootserver=10.42.0.1, rootserver=10.42.0.1, rootpath=
[ 3.802492] nameserver0=10.42.0.1
[ 3.871575] VFS: Mounted root (nfs filesystem) on device 0:15.
[ 3.879903] devtmpfs: mounted
[ 3.883713] Freeing unused kernel memory: 380K (c07ef000 - c084e000)
After removing the flag, things return to normal:
....
[ 3.862291] Freeing unused kernel memory: 380K (c07ef000 - c084e000)
10.42.0.17 login:
If I set the flag from within a running shell on the Beagle Bone board itself, the shell stops responding right after the chmod is performed.
I suspect it has something to do with the way NFS exports the rootfs, but that's only a guess, so a qualified explanation and a possible solution would be helpful.
After some research, I will answer my own question. The answer is very simple: for the above to work, the busybox binary must be owned by root:root. A setuid binary runs with the UID of its owner, so with myuser as the owner, every busybox applet (presumably including the ones used during boot) ran as myuser instead of root. The simplest solution is just to change the ownership.
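The condition described here (setuid bit set and owner root) is easy to check programmatically. The following is a minimal sketch. The function names are made up, and the path in the usage note is only an example.

```python
import os
import stat


def setuid_runs_as_root(mode, owner_uid):
    """A setuid binary runs as its owner, so it only gives root if root owns it."""
    return bool(mode & stat.S_ISUID) and owner_uid == 0


def check_binary(path):
    """Apply the check to a file on disk."""
    st = os.stat(path)
    return setuid_runs_as_root(st.st_mode, st.st_uid)
```

For the listing in the question (mode 4755, owner myuser), the check fails, which matches the observed behaviour. Calling something like `check_binary("/bin/busybox")` on the target would confirm it after the ownership change.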

Detecting a chroot jail from within

How can one detect being in a chroot jail without root privileges? Assume a standard BSD or Linux system. The best I came up with was to look at the inode value for "/" and to consider whether it is reasonably low, but I would like a more accurate method for detection.
[edit 20080916 142430 EST] Simply looking around the filesystem isn't sufficient, as it's not difficult to duplicate things like /boot and /dev to fool the jailed user.
[edit 20080916 142950 EST] For Linux systems, checking for unexpected values within /proc is reasonable, but what about systems that don't support /proc in the first place?
The inode for / will always be 2 if it's the root directory of an ext2/ext3/ext4 filesystem, but you may be chrooted inside a complete filesystem. If it's just chroot (and not some other virtualization), you could run mount and compare the mounted filesystems against what you see. Verify that every mount point has inode 2.
On Linux with root permissions, test if the root directory of the init process is your root directory. Although /proc/1/root is always a symbolic link to /, following it leads to the “master” root directory (assuming the init process is not chrooted, but that's hardly ever done). If /proc isn't mounted, you can bet you're in a chroot.
[ "$(stat -c %d:%i /)" != "$(stat -c %d:%i /proc/1/root/.)" ]
# With ash/bash/ksh/zsh
! [ -x /proc/1/root/. ] || [ /proc/1/root/. -ef / ]
This is more precise than looking at /proc/1/exe because that could be different outside a chroot if init has been upgraded since the last boot or if the chroot is on the main root filesystem and init is hard linked in it.
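The same comparison can be sketched in Python with os.path.samefile, which compares device and inode numbers. The function name chroot_status and its return strings are made up for this example. Like the shell version, it assumes that pid 1 itself is not chrooted and that the caller may follow /proc/1/root (normally root only).

```python
import os


def chroot_status(init_root="/proc/1/root/."):
    """Compare our root directory with the root directory of pid 1."""
    try:
        if os.path.samefile("/", init_root):
            return "not chrooted"
        return "chrooted"
    except PermissionError:
        return "unknown (need root to follow /proc/1/root)"
    except FileNotFoundError:
        return "probably chrooted (/proc not mounted)"


if __name__ == "__main__":
    print(chroot_status())
```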
If you do not have root permissions, you can look at /proc/1/mountinfo and /proc/$$/mountinfo (briefly documented in filesystems/proc.txt in the Linux kernel documentation). This file is world-readable and contains a lot of information about each mount point in the process's view of the filesystem. The paths in that file are restricted by the chroot affecting the reader process, if any. If the process reading /proc/1/mountinfo is chrooted into a filesystem that's different from the global root (assuming pid 1's root is the global root), then no entry for / appears in /proc/1/mountinfo. If the process reading /proc/1/mountinfo is chrooted to a directory on the global root filesystem, then an entry for / appears in /proc/1/mountinfo, but with a different mount id. Incidentally, the root field ($4) indicates where the chroot is in its master filesystem. Again, this is specific to Linux.
[ "$(awk '$5=="/" {print $1}' </proc/1/mountinfo)" != "$(awk '$5=="/" {print $1}' </proc/$$/mountinfo)" ]
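The awk comparison above can be mirrored in Python as follows. This is a sketch that takes the contents of the two mountinfo files as strings, so it can be tried on sample data. The function names and the sample lines in the usage note are made up. If several entries are mounted at /, it only looks at the first one.

```python
def root_mount_id(mountinfo_text):
    """Return the mount ID (field 1) of the entry whose mount point (field 5) is /."""
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[4] == "/":
            return fields[0]
    return None


def looks_chrooted(init_info, self_info):
    """True if pid 1 and the current process see different mounts at /."""
    return root_mount_id(init_info) != root_mount_id(self_info)


if __name__ == "__main__":
    import os
    with open("/proc/1/mountinfo") as a, open("/proc/%d/mountinfo" % os.getpid()) as b:
        print("chrooted:", looks_chrooted(a.read(), b.read()))
```

With a hypothetical pid 1 entry such as `25 0 8:2 / / rw - ext4 /dev/sda2 rw` and a process entry such as `90 25 8:2 /opt/debian-jail / rw - ext4 /dev/sda2 rw`, the mount IDs differ (25 versus 90), so the check reports a chroot.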
If you are not in a chroot and the root filesystem is ext2/ext3/ext4, the inode for / will always be 2. You may check that using
stat -c %i /
or
ls -id /
Interesting, but let's try to find the path of the chroot directory. Ask stat which device / is located on:
stat -c %04D /
The first byte is the device's major number and the last byte is its minor number. For example, 0802 means major 8, minor 2. If you check in /dev, you will see that this device is /dev/sda2. If you are root, you can directly create the corresponding device node in your chroot:
mknod /tmp/root_dev b 8 2
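Splitting the hexadecimal device number by hand is error-prone, so here is a small sketch that does it. It assumes the usual Linux/glibc dev_t layout (12-bit major, 20-bit minor, with the low 8 minor bits in the last byte), which reduces to the two-byte rule for small numbers. The function name split_dev is made up for this example.

```python
def split_dev(hex_dev):
    """Split the output of `stat -c %D` into (major, minor)."""
    dev = int(hex_dev, 16)
    major = (dev >> 8) & 0xFFF
    # Low 8 bits of the minor sit in the last byte, the rest above the major.
    minor = (dev & 0xFF) | ((dev >> 12) & 0xFFF00)
    return major, minor


if __name__ == "__main__":
    for sample in ("0802", "fd00"):
        print(sample, "->", split_dev(sample))
```

For 0802 this gives major 8 and minor 2, which are the two numbers to pass to mknod.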
Now, let's find the inode associated with our chroot. debugfs can list the contents of a directory by inode number. For example, ls -id / returned 923960:
sudo debugfs /tmp/root_dev -R 'ls <923960>'
923960 (12) . 915821 (32) .. 5636100 (12) var
5636319 (12) lib 5636322 (12) usr 5636345 (12) tmp
5636346 (12) sys 5636347 (12) sbin 5636348 (12) run
5636349 (12) root 5636350 (12) proc 5636351 (12) mnt
5636352 (12) home 5636353 (12) dev 5636354 (12) boot
5636355 (12) bin 5636356 (12) etc 5638152 (16) selinux
5769366 (12) srv 5769367 (12) opt 5769375 (3832) media
The interesting information is the inode of the .. entry: 915821. I can ask for its contents:
sudo debugfs /tmp/root_dev -R 'ls <915821>'
915821 (12) . 2 (12) .. 923960 (20) debian-jail
923961 (4052) other-jail
The directory called debian-jail has inode 923960, so the last component of my chroot directory is debian-jail. Now let's look at the parent directory (inode 2):
sudo debugfs /tmp/root_dev -R 'ls <2>'
2 (12) . 2 (12) .. 11 (20) lost+found 1046529 (12) home
130817 (12) etc 784897 (16) media 3603 (20) initrd.img
261633 (12) var 654081 (12) usr 392449 (12) sys 392450 (12) lib
784898 (12) root 915715 (12) sbin 1046530 (12) tmp
1046531 (12) bin 784899 (12) dev 392451 (12) mnt
915716 (12) run 12 (12) proc 1046532 (12) boot 13 (16) lib64
784945 (12) srv 915821 (12) opt 3604 (3796) vmlinuz
The directory called opt has inode 915821, and inode 2 is the root of the filesystem, so my chroot directory is /opt/debian-jail. Of course, /dev/sda2 may be mounted somewhere other than /, so you need to check that (use lsof or read the information directly from /proc).
Preventing stuff like that is the whole point. If it's your code that's supposed to run in the chroot, have it set a flag on startup. If you're hacking, hack: check for several common things in known locations, count the files in /etc, something in /dev.
On BSD systems (check with uname -a), proc should always be present. Check if the dev/inode pair of /proc/1/exe (use stat on that path, it won't follow the symlink by text but by the underlying hook) matches /sbin/init.
Checking the root for inode #2 is also a good one.
On most other systems, a root user can find out much faster by attempting the fchdir root-breaking trick. If it gets you anywhere, you are in a chroot jail.
I guess it depends why you might be in a chroot, and whether any effort has gone into disguising it.
I'd check /proc, these files are automatically generated system information files. The kernel will populate these in the root filesystem, but it's possible that they don't exist in the chroot filesystem.
If the root filesystem's /proc has been bound to /proc in the chroot, then it is likely that there are some discrepancies between that information and the chroot environment. Check /proc/mounts for example.
Similarly, check /sys.
I wanted the same information for a jail running on FreeBSD (as Ansible doesn't seem to detect this scenario).
On the FreeNAS distribution of FreeBSD 11, /proc is not mounted on the host, but it is within the jail. Whether this is also true on regular FreeBSD I don't know for sure, but the article "procfs: Gone But Not Forgotten" seems to suggest it is. Either way, you probably wouldn't want to try mounting it just to detect jail status, and therefore I'm not certain it can be used as a reliable predictor of being within a jail.
I also ruled out using stat on / as certainly on FreeNAS all jails are given their own file system (i.e. a ZFS dataset) and therefore the / node on the host and in the jail both have inode 4. I expect this is common on FreeBSD 11 in general.
So the approach I settled on was using procstat on pid 0.
[root@host ~]# procstat 0
PID PPID PGID SID TSID THR LOGIN WCHAN EMUL COMM
0 0 0 0 0 1234 - swapin - kernel
[root@host ~]# echo $?
0
[root@host ~]# jexec guest tcsh
root@guest:/ # procstat 0
procstat: sysctl(kern.proc): No such process
procstat: procstat_getprocs()
root@guest:/ # echo $?
1
I am making an assumption here that pid 0 will always be the kernel on the host, and there won't be a pid 0 inside the jail.
If you entered the chroot with schroot, then you can check the value of $debian_chroot.

Resources