I'm trying to use oprofile to record cache misses in a large realtime app:
$ sudo opcontrol --no-vmlinux --event=LLC_MISSES:100000 --session-dir=/var/tmp/oprofile -c=5 --start
But when I look at the reports, it doesn't mention the cache misses. It only samples CPU_CLK_UNHALTED:
$ sudo opreport -l --session-dir=/var/tmp/oprofile
CPU: Intel Architectural Perfmon, speed 1596 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
samples % image name app name symbol name
63243 92.2946 no-vmlinux no-vmlinux /no-vmlinux
564 0.8231 libc-2.13.so libc-2.13.so /lib32/libc-2.13.so
(etc)
But --status claims that oprofile is sampling L2 misses:
$ sudo opcontrol --status
Daemon running: pid 3220
Event 0: LLC_MISSES:500000:65:1:1
Separate options: library
vmlinux file: none
Image filter: none
Call-graph depth: 5
What am I doing wrong? I can't get it to sample any of the other counters listed in ophelp either.
This is with oprofile 0.9.6 on Ubuntu, kernel version 2.6.38.
It turns out you need to actually kill and restart the oprofile daemon with
sudo opcontrol --stop
sudo opcontrol --reset
sudo opcontrol --shutdown
sudo opcontrol --start-daemon
sudo opcontrol --start
when changing the sampled events. Simply stopping and starting the profiler isn't enough. Not that this is documented anywhere.
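When switching events, re-register the event between the shutdown and the restart, and verify that the daemon picked it up before trusting any samples (a sketch reusing the session dir and event spec from the question):
sudo opcontrol --no-vmlinux --event=LLC_MISSES:100000 --session-dir=/var/tmp/oprofile
sudo opcontrol --status
sudo opreport -l --session-dir=/var/tmp/oprofile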
I am using Manjaro Linux, and a while ago PulseAudio stopped working (after an update, I assume; I did not change any configuration manually).
The symptoms:
In KDE, no sound device is found
Pulseaudio volume control cannot connect to Pulseaudio
ps x | grep pulse returns no running processes.
pulseaudio -k says no pulseaudio process is running.
However, applications still play music through the speakers; I just cannot adjust the volume.
When running "pulseaudio" I get the following output:
W: [pulseaudio] pid.c: Stale PID file, overwriting.
N: [pulseaudio] alsa-util.c: Disabling timer-based scheduling because running inside a VM.
N: [pulseaudio] alsa-util.c: Disabling timer-based scheduling because running inside a VM.
N: [pulseaudio] alsa-util.c: Disabling timer-based scheduling because running inside a VM.
zsh: killed pulseaudio
When I run "pulseaudio -D -v" I get the following output:
[pulseaudio] main.c: Daemon startup successful.
When I run "pulseaudio -k" afterwards it says it cannot abort the process because it could not been found. Also, "ps x | grep pulse" does not list any running process.
Any hints how to fix this?
chrt -p 14490
pid 14490's current scheduling policy: SCHED_OTHER
pid 14490's current scheduling priority: 0
I am trying to change the scheduling policy of this process to SCHED_RR using the command below and am running into the following error.
chrt -r -p 25 14490
chrt: failed to set pid 14490's policy: Operation not permitted
How can I debug why this is failing?
You failed to specify your Linux version...
... but here are a few options:
https://unix.stackexchange.com/questions/114643/chrt-failed-to-set-pid-xxxs-policy-on-one-machine-but-not-others
sysctl -w kernel.sched_rt_runtime_us=-1
https://lists.opensuse.org/opensuse-security/2011-04/msg00015.html
... and ...
https://www.linuxquestions.org/questions/slackware-14/chrt-from-shell-scripts-operation-not-permitted-4175590174/
I tested on a virtualized slackware 14.2. No error. I upgraded to
util-linux-2.28.2 from current and then I had that error.
Upstream commit: https://github.com/karelzak/util-lin...ec919bec94089f
Marking thread as solved.
In other words:
You can try sysctl -w kernel.sched_rt_runtime_us=-1 to lift the kernel's global real-time throttling.
But there are at least two reported instances where this error was caused by a bug, one in Slackware and one in util-linux; in both cases the fix was updating the affected package.
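Before assuming a bug, a few quick checks are worth running first (a sketch; run them as the same user that invokes chrt):
ulimit -r
cat /proc/sys/kernel/sched_rt_runtime_us
cat /proc/sys/kernel/sched_rt_period_us
sudo chrt -r -p 25 14490
ulimit -r reports RLIMIT_RTPRIO; a value of 0 means an unprivileged user may not request any real-time priority at all, which yields exactly this "Operation not permitted". The last command retries as root, which bypasses that limit.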
I want Docker to start with the systemd cgroup driver. For some reason it is using only cgroupfs on my CentOS 7 server.
Here is the startup config file.
# systemctl cat docker
# /usr/lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.com
After=network.target
Wants=docker-storage-setup.service
Requires=docker-cleanup.timer
[Service]
Type=notify
NotifyAccess=all
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
EnvironmentFile=-/etc/sysconfig/docker-network
Environment=GOTRACEBACK=crash
Environment=DOCKER_HTTP_HOST_COMPAT=1
Environment=PATH=/usr/libexec/docker:/usr/bin:/usr/sbin
ExecStart=/usr/bin/dockerd-current \
--add-runtime docker-runc=/usr/libexec/docker/docker-runc-current \
--default-runtime=docker-runc \
--exec-opt native.cgroupdriver=systemd \
--userland-proxy-path=/usr/libexec/docker/docker-proxy-current \
$OPTIONS \
$DOCKER_STORAGE_OPTIONS \
$DOCKER_NETWORK_OPTIONS \
$ADD_REGISTRY \
$BLOCK_REGISTRY \
$INSECURE_REGISTRY
ExecReload=/bin/kill -s HUP $MAINPID
LimitNOFILE=1048576
LimitNPROC=1048576
LimitCORE=infinity
TimeoutStartSec=0
Restart=on-abnormal
MountFlags=slave
[Install]
WantedBy=multi-user.target
# /etc/systemd/system/docker.service.d/docker-thinpool.conf
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --storage-driver=devicemapper --storage-opt=dm.thinpooldev=/dev/mapper/docker-thinpool \
--storage-opt=dm.use_deferred_removal=true --storage-opt=dm.use_deferred_deletion=true
When I start Docker, it's running like this:
# ps -fed | grep docker
root 8436 1 0 19:13 ? 00:00:00 /usr/bin/dockerd-current --storage-driver=devicemapper --storage-opt=dm.thinpooldev=/dev/mapper/docker-thinpool --storage-opt=dm.use_deferred_removal=true --storage-opt=dm.use_deferred_deletion=true
root 8439 8436 0 19:13 ? 00:00:00 /usr/bin/docker-containerd-current -l unix:///var/run/docker/libcontainerd/docker-containerd.sock --shim docker-containerd-shim --metrics-interval=0 --start-timeout 2m --state-dir /var/run/docker/libcontainerd/containerd --runtime docker-runc
Here is the output of docker info:
# docker info
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 1
Server Version: 1.12.6
Storage Driver: devicemapper
Pool Name: docker-thinpool
Pool Blocksize: 524.3 kB
Base Device Size: 10.74 GB
Backing Filesystem: xfs
Data file:
Metadata file:
Data Space Used: 185.6 MB
Data Space Total: 1.015 GB
Data Space Available: 829.4 MB
Metadata Space Used: 77.82 kB
Metadata Space Total: 8.389 MB
Metadata Space Available: 8.311 MB
Thin Pool Minimum Free Space: 101.2 MB
Udev Sync Supported: true
Deferred Removal Enabled: true
Deferred Deletion Enabled: true
Deferred Deleted Device Count: 0
Library Version: 1.02.135-RHEL7 (2016-11-16)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: null bridge overlay host
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 3.10.0-514.16.1.el7.x86_64
Operating System: CentOS Linux 7 (Core)
OSType: linux
Architecture: x86_64
Number of Docker Hooks: 2
CPUs: 1
Total Memory: 992.7 MiB
Name: master
ID: 6CFR:H7SN:MEU7:PNJH:UMSO:6MNE:43Q5:SF4K:Z25I:BKHP:53U4:63SO
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
127.0.0.0/8
Registries: docker.io (secure)
How can I make it run with systemd?
Thanks
SR
A solution that does not involve editing systemd units or drop-ins would be to create (or edit) the /etc/docker/daemon.json configuration file and to include the following:
{
"exec-opts": ["native.cgroupdriver=systemd"]
}
After saving it, restart your docker service.
sudo systemctl restart docker
This solution is obviously only feasible if you want to apply the change system-wide.
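You can confirm the change took effect with:
docker info | grep -i cgroup
which should now report Cgroup Driver: systemd.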
Since I have two configuration files, I needed to add the entry to the second config file as well, /etc/systemd/system/docker.service.d/docker-thinpool.conf:
--exec-opt native.cgroupdriver=systemd \
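With that flag in place, the drop-in from the question would look something like this (same paths as the original file; run systemctl daemon-reload and restart Docker afterwards):
# /etc/systemd/system/docker.service.d/docker-thinpool.conf
[Service]
ExecStart=
ExecStart=/usr/bin/dockerd --storage-driver=devicemapper --storage-opt=dm.thinpooldev=/dev/mapper/docker-thinpool \
--storage-opt=dm.use_deferred_removal=true --storage-opt=dm.use_deferred_deletion=true \
--exec-opt native.cgroupdriver=systemd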
Just to add: cgroupfs is Docker's own control-group manager. However, systemd is now the default init system for the majority of Linux distributions, and systemd has tight integration with Linux control groups. The Kubernetes docs recommend using systemd (see the quote below), since using cgroupfs alongside systemd is non-optimal.
So it is better to use systemd for cgroup management. kubelet is configured by default to use systemd, so it is easier and better to change Docker to use the systemd cgroup driver as well.
A history of this overlap is here: https://lwn.net/Articles/676831/
On the Kubernetes site they recommend using systemd: https://kubernetes.io/docs/setup/production-environment/container-runtimes/
Cgroup drivers: When systemd is chosen as the init system for a Linux distribution, the init process generates and consumes a root control group (cgroup) and acts as a cgroup manager. Systemd has a tight integration with cgroups and will allocate cgroups per process. It's possible to configure your container runtime and the kubelet to use cgroupfs. Using cgroupfs alongside systemd means that there will then be two different cgroup managers.
Control groups are used to constrain resources that are allocated to processes. A single cgroup manager will simplify the view of what resources are being allocated and will by default have a more consistent view of the available and in-use resources. When we have two managers we end up with two views of those resources. We have seen cases in the field where nodes that are configured to use cgroupfs for the kubelet and Docker, and systemd for the rest of the processes running on the node, become unstable under resource pressure.
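On the kubelet side, the matching knob lives in its KubeletConfiguration file; a minimal sketch (the path is the kubeadm default, adjust for your install):
# /var/lib/kubelet/config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd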
OS: CentOS 7.4. Kubernetes 1.23.1 recommends the systemd cgroup driver, while Docker 20.10.20 uses cgroupfs, so you have to change the Docker service file.
Step 1: stop the Docker service
systemctl stop docker
Step 2: in /etc/systemd/system/multi-user.target.wants/docker.service and /usr/lib/systemd/system/docker.service, change the ExecStart line
From:
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
To:
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock --exec-opt native.cgroupdriver=systemd
Step 3: reload systemd, then start the Docker service and the kubelet
systemctl daemon-reload
systemctl start docker
kubeadm init phase kubelet-start
Make sure you are logged in as root and execute the two commands below. Note the single > (overwrite) rather than >>: appending to an existing daemon.json would produce invalid JSON. This assumes the file does not already contain other settings; if it does, merge the key in instead.
echo '{"exec-opts": ["native.cgroupdriver=systemd"]}' > /etc/docker/daemon.json
systemctl restart docker
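If daemon.json already has content, merging the key preserves the existing settings; a sketch assuming jq is installed:
tmp=$(mktemp)
jq '. + {"exec-opts": ["native.cgroupdriver=systemd"]}' /etc/docker/daemon.json > "$tmp" && mv "$tmp" /etc/docker/daemon.json
systemctl restart docker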
Try to restart the docker service:
systemctl daemon-reload
systemctl restart docker.service
I'm trying to get this running and don't know what I'm doing wrong. I have created a Debian.img (a disk in raw format, created with the virtual machine manager GUI for libvirt, I guess) and installed Debian with no trouble. Now I want to get this running with a self-compiled kernel. I copied the .config file from my working (virtual) Debian and made no further changes at all. This is what I do:
qemu-system-x86_64 -m 1024M -kernel /path/to/bzImage -hda /var/lib/libvirt/images/Debian.img -append "root=/dev/sda1 console=ttyS0" -enable-kvm -nographic
But during boot I always get this error message.
[ 0.195285] Initializing network drop monitor service
[ 0.196177] List of all partitions:
[ 0.196641] No filesystem could mount root, tried:
[ 0.197292] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
[ 0.198355] Pid: 1, comm: swapper/0 Not tainted 3.2.46 #7
[ 0.199055] Call Trace:
[ 0.199386] [<ffffffff81318c30>] ? panic+0x95/0x19e
[ 0.200049] [<ffffffff81680f7d>] ? mount_block_root+0x245/0x271
[ 0.200834] [<ffffffff8168112f>] ? prepare_namespace+0x133/0x169
[ 0.201590] [<ffffffff81680c94>] ? kernel_init+0x14c/0x151
[ 0.202273] [<ffffffff81325a34>] ? kernel_thread_helper+0x4/0x10
[ 0.203022] [<ffffffff81680b48>] ? start_kernel+0x3c1/0x3c1
[ 0.203716] [<ffffffff81325a30>] ? gs_change+0x13/0x13
What am I doing wrong? Please, someone help. Do I need to pass the -initrd option? I already tried that but had no luck.
I figured it out myself. Some time has passed, but as I recall the solution was to provide an initial ramdisk. This is how I got it working with hardware acceleration.
Compiling
make defconfig
CONFIG_EXT4_FS=y
CONFIG_IA32_EMULATION=y
CONFIG_VIRTIO_PCI=y (Virtualization -> PCI driver for virtio devices)
CONFIG_VIRTIO_BALLOON=y (Virtualization -> Virtio balloon driver)
CONFIG_VIRTIO_BLK=y (Device Drivers -> Block -> Virtio block driver)
CONFIG_VIRTIO_NET=y (Device Drivers -> Network device support -> Virtio network driver)
CONFIG_VIRTIO=y (automatically selected)
CONFIG_VIRTIO_RING=y (automatically selected)
---> see http://www.linux-kvm.org/page/Virtio
Enable paravirt in config
Disable the NMI watchdog on the HOST if you want to use performance counters in the GUEST. You may ignore this step.
cat /proc/sys/kernel/nmi_watchdog
---> see http://kvm.et.redhat.com/page/Guest_PMU
Start in Qemu
sudo qemu-system-x86_64 -m 1024M -hda /var/lib/libvirt/images/Debian.img -enable-kvm -initrd /home/username/compiled_kernel/initrd.img-3.2.46 -kernel /home/username/compiled_kernel/bzImage -append "root=/dev/sda1 console=ttyS0" -nographic -redir tcp:2222::22 -cpu host -smp cores=2
Start in KVM
Kernel path: /home/username/compiled_kernel/bzImage
Initrd path: /home/username/compiled_kernel/initrd.img-3.2.46
Kernel arguments: root=/dev/sda1
Hope this helps if someone has the same issues.
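With the virtio options above compiled in, the disk can also be attached as a virtio device; a sketch of the equivalent invocation (the disk then shows up as /dev/vda, so root= changes accordingly):
sudo qemu-system-x86_64 -m 1024M -enable-kvm \
-drive file=/var/lib/libvirt/images/Debian.img,if=virtio,format=raw \
-initrd /home/username/compiled_kernel/initrd.img-3.2.46 \
-kernel /home/username/compiled_kernel/bzImage \
-append "root=/dev/vda1 console=ttyS0" -nographic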
This is for the AArch64 (arm64) on QEMU case.
I was following this good tutorial: https://ibug.io/blog/2019/04/os-lab-1/
In my case I was met with this error message:
---[ end Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(1,0) ]---
I did mknod dev/ram b 1 0 in the initrd.
Later I noticed there was an error message above that line implying the kernel didn't support the ram disk. So I edited .config and set these items:
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_COUNT=1
CONFIG_BLK_DEV_RAM_SIZE=131072 (= 128 MB; the number is in units of 1024 B)
And then the problem was gone! The initrd was mounted on /dev/ram and the first init process ran well.
It turns out running make defconfig didn't set these values by default for me.
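For reference, an initrd containing that device node can be assembled along these lines (a sketch; it assumes a directory initrd/ with a working /init script, and needs root or fakeroot for mknod):
cd initrd
mknod -m 660 dev/ram b 1 0
find . | cpio -o -H newc | gzip > ../initrd.img
Then pass ../initrd.img to QEMU with -initrd.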
Maybe your system image file is bad and cannot be mounted.
You may try these commands to mount the image file and check whether it is a valid root file system for Linux:
losetup /dev/loop0 /var/lib/libvirt/images/Debian.img
kpartx -av /dev/loop0
mount /dev/mapper/loop0p1 /mnt/tmp
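And to undo it afterwards:
umount /mnt/tmp
kpartx -dv /dev/loop0
losetup -d /dev/loop0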
The most likely thing is that the kernel doesn't know the correct device to boot from.
You can supply this explicitly from the qemu command line. So if the root is on partition 2, you can say:
qemu -kernel /path/to/bzImage \
-append root=/dev/sda2 \
-hda /path/to/hda.img \
.
.
.
Notice I use /dev/sda2 even though the disk is IDE: modern kernels drive IDE disks through the libata layer, so they show up as /dev/sdX, just like SATA.
The other possibilities are that, as @Houcheng says, your root FS is corrupted, or else that the kernel does not have that particular FS type built in. But I think you would get a different error in either of those cases.
QEMU version
QEMU emulator version 2.5.0 (Debian 1:2.5+dfsg-5ubuntu10.11), Copyright (c) 2003-2008 Fabrice Bellard
running a Buildroot-built 4.9.6 kernel with the following arguments passed:
qemu-system-x86_64 -kernel output/images/bzImage -hda output/images/rootfs.qcow2 -boot c -m 128 -append root=/dev/sda -localtime -no-reboot -name rtlinux -net nic -net user -redir tcp:2222::22 -redir tcp:3333::3333
it was accepting only /dev/sda as an option for the root fs to mount (it shows a little hint about the available partitions once it boots and hangs with the following error):
VFS: Cannot open root device "hda" or unknown-block(0,0): error -6
Please append a correct "root=" boot option; here are the available partitions:
0800 61440 sda driver: sd
I am trying to use oprofile 0.9.8 under Ubuntu 12.10 running on a Pentium D processor (Dell OptiPlex-GX620 desktop). When I try something simple like "operf ls" I get
perf_event_open failed with Invalid argument
Caught runtime_error: Internal Error. Perf event setup failed.
Error running profiler
I have success running oprofile in legacy mode using opcontrol commands under sudo by installing the module with timer=1 (see below).
It appears that operf, which is the new preferred method, is unhappy in this configuration.
I have verified that all the dependent packages are loaded.
On the oprofile website they do not call out the Pentium D as a separate architecture, so I am not sure whether it uses the regular Pentium architecture.
I have searched everywhere and can't find anything like this reported. Any help identifying the problem here would be appreciated.
P.S. When I run in legacy mode using opcontrol I have some success:
denham@denham-OptiPlex-GX620:~$ sudo opcontrol --start
ATTENTION: Use of opcontrol is discouraged. Please see the man page for operf.
Using default event: GLOBAL_POWER_EVENTS:100000:1:1:1
Error: counter 0 not available nmi_watchdog using this resource ? Try:
opcontrol --deinit
echo 0 > /proc/sys/kernel/nmi_watchdog
When I force the module to be installed with timer=1:
denham@denham-OptiPlex-GX620:~$ sudo opcontrol --deinit
Unloading oprofile module
denham@denham-OptiPlex-GX620:~$ sudo modprobe oprofile timer=1
denham@denham-OptiPlex-GX620:~$ sudo opcontrol --no-vmlinux
denham@denham-OptiPlex-GX620:~$ sudo opcontrol --start
ATTENTION: Use of opcontrol is discouraged. Please see the man page for operf.
Using 2.6+ OProfile kernel interface.
Using log file /var/lib/oprofile/samples/oprofiled.log
Daemon started.
Profiler running.
denham@denham-OptiPlex-GX620:~$ ./a
^C
denham@denham-OptiPlex-GX620:~$ sudo opcontrol --shutdown
Stopping profiling.
Killing daemon.
denham@denham-OptiPlex-GX620:~$ opreport --callgraph
Using /var/lib/oprofile/samples/ for samples directory.
warning: /no-vmlinux could not be found.
warning: [vdso] (tgid:1697 range:0xb77ab000-0xb77ac000) could not be found.
warning: [vdso] (tgid:1728 range:0xb77b6000-0xb77b7000) could not be found.
warning: [vdso] (tgid:3310 range:0xb7702000-0xb7703000) could not be found.
CPU: CPU with timer interrupt, speed 2992.41 MHz (estimated)
Profiling through timer interrupt
samples % image name app name symbol name
-------------------------------------------------------------------------------
31878 81.1868 no-vmlinux no-vmlinux /no-vmlinux
31878 100.000 no-vmlinux no-vmlinux /no-vmlinux [self]
-------------------------------------------------------------------------------
2820 7.1820 a a main
2820 100.000 a a main [self]
-------------------------------------------------------------------------------
1065 2.7123 vino-server vino-server /usr/lib/vino/vino-server
1065 100.000 vino-server vino-server /usr/lib/vino/vino-server [self]
-------------------------------------------------------------------------------
1056 2.6894 a a b
1056 100.000 a a b [self]
-------------------------------------------------------------------------------
1013 2.5799 a a c
1013 100.000 a a c [self]
-------------------------------------------------------------------------------
968 2.4653 a a d
968 100.000 a a d [self]
-------------------------------------------------------------------------------
264 0.6724 libc-2.15.so libc-2.15.so /lib/i386-linux-gnu/libc-2.15.so
. . . . .
Don't know if this is the main problem, but the error message says:
"Error: counter 0 not available nmi_watchdog using this resource ? Try:
opcontrol --deinit
echo 0 > /proc/sys/kernel/nmi_watchdog"
To get rid of this you have to disable the NMI watchdog kernel parameter. On Ubuntu this is done via GRUB:
Edit /etc/default/grub and add "nmi_watchdog=0" to GRUB_CMDLINE_LINUX.
Then run sudo update-grub and check the value with cat /proc/sys/kernel/nmi_watchdog (it should be "0"). Reboot to apply the new config if needed.
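For example, after the edit the relevant line in /etc/default/grub would read (keep any options already present on your system):
GRUB_CMDLINE_LINUX="nmi_watchdog=0"
Alternatively, echo 0 > /proc/sys/kernel/nmi_watchdog as root disables the watchdog for the current boot only, without a reboot.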