How to enable PMU in a KVM guest - Linux

I am running KVM/QEMU on my Lenovo X1 laptop.
The guest OS is Ubuntu 15.04 x86_64.
Now I want to run the 'perf' command in the guest OS, but I found the following in the guest's dmesg:
...
[ 0.055442] smpboot: CPU0: Intel Xeon E3-12xx v2 (Ivy Bridge) (fam: 06, model: 3a, stepping: 09)
[ 0.056000] Performance Events: unsupported p6 CPU model 58 no PMU driver, software events only.
[ 0.057602] x86: Booting SMP configuration:
[ 0.058686] .... node #0, CPUs: #1
[ 0.008000] kvm-clock: cpu 1, msr 0:1ffd6041, secondary cpu clock
...
So the perf command could NOT use hardware PMU events in the guest OS.
How can I expose the host's hardware PMU to the Ubuntu guest?
Thanks,
-Tao

The page https://github.com/mozilla/rr/wiki/Building-And-Installing gives some hints on how to enable the guest PMU:
Qemu: On QEMU command line use
-cpu host
Libvirt/KVM: Specify CPU passthrough in domain XML definition:
<cpu mode='host-passthrough'/>
The same advice is in https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/virtualization_tuning_and_optimization_guide/sect-virtualization_tuning_optimization_guide-monitoring_tools-vpmu
I put the <cpu mode='host-passthrough'/> line into the /etc/libvirt/qemu/my_vm_name.xml file in place of the existing <cpu>...</cpu> block.
(In virt-manager, use "host-passthrough" in the CPU "Model:" field - http://blog.wikichoon.com/2016/01/using-cpu-host-passthrough-with-virt.html)
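For illustration, a minimal sketch of where the element sits in the domain XML (my_vm_name and the surrounding elements are placeholders; editing via virsh edit my_vm_name makes libvirt re-read the definition):
<domain type='kvm'>
  ...
  <!-- replaces the previous <cpu>...</cpu> block -->
  <cpu mode='host-passthrough'/>
  ...
</domain>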
Now the PMU works: I tested with perf stat echo inside the VM, "arch_perfmon" shows up in /proc/cpuinfo, and the PMU is reported as enabled in dmesg | grep PMU.
QEMU's -cpu host option was indeed used, according to /var/log/libvirt/qemu/vm_name.log:
/usr/bin/kvm-spice ... -machine ...,accel=kvm,... -cpu host ...
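Spelled out, the checks inside the guest are roughly these (output will vary with kernel and CPU):
grep -o arch_perfmon /proc/cpuinfo | sort -u    # flag exposed by -cpu host / host-passthrough
dmesg | grep -i PMU                             # PMU driver initialization messages
perf stat echo                                  # hardware counters instead of <not supported>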

Related

How to add VFIO-IOMMU in KVM virtual machine (Aarch64)?

I am using aarch64 Linux to test the VFIO-IOMMU feature in a KVM VM.
The host is a Cortex-A78 running Linux 5.10.104 (with VFIO_IOMMU enabled). The guest OS is Ubuntu 22.04 (Linux 5.15, also with VFIO_IOMMU enabled).
The VM is created with virt-manager and uses virtio devices such as a NIC, SCSI, etc.
But I could not find how to add a VFIO-IOMMU device to the VM anywhere on the internet.
I tried adding the following line to the vm.xml:
<iommu model='smmuv3'/>
But after the guest OS boots, I only see the following logs about the IOMMU and nothing about SMMUv3:
t@t:~$ dmesg | grep -i mmu
[ 0.320696] iommu: Default domain type: Translated
[ 0.321218] iommu: DMA domain TLB invalidation policy: strict mode
So how can a VFIO IOMMU be supported/added to the VM in this case?
The qemu-system-aarch64 version is 4.2.1; I am not sure whether it supports SMMUv3 for ARMv8.
I confirmed that QEMU 6.2.0 supports SMMUv3. The guest OS log shows the following:
[ 0.578157] arm-smmu-v3 arm-smmu-v3.0.auto: option mask 0x0
[ 0.578841] arm-smmu-v3 arm-smmu-v3.0.auto: ias 44-bit, oas 44-bit (features 0x00008305)
[ 0.580289] arm-smmu-v3 arm-smmu-v3.0.auto: allocated 65536 entries for cmdq
[ 0.581060] arm-smmu-v3 arm-smmu-v3.0.auto: allocated 128 entries for evtq
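For reference, a minimal sketch of the relevant domain XML (assuming a QEMU/libvirt new enough to expose SMMUv3, as noted above; the interface element is only an example device): the <iommu> element goes under <devices>, and virtio devices that should sit behind the SMMU get iommu='on' on their <driver> element:
<devices>
  ...
  <interface type='network'>
    ...
    <driver iommu='on'/>   <!-- route this virtio device through the virtual SMMU -->
  </interface>
  <iommu model='smmuv3'/>  <!-- expose a virtual SMMUv3 to the guest -->
  ...
</devices>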

Using QEMU+KVM is slower than plain QEMU in rv6 (an xv6 port to Rust)

We are currently working on the rv6 project, which ports MIT's educational operating system xv6 to Rust. Our code is located here.
We use QEMU and its virt platform to run rv6, and it works well under plain QEMU.
The command executed on the ARM machine is:
RUST_MODE=release TARGET=arm KVM=yes GIC_VERSION=3
qemu-system-aarch64 -machine virt -kernel kernel/kernel -m 128M -smp 80 -nographic -drive file=fs.img,if=none,format=raw,id=x0,copy-on-read=off -device virtio-blk-device,drive=x0,bus=virtio-mmio-bus.0 -cpu cortex-a53 -machine gic-version=3 -net none
To run a speed-boost experiment with KVM, we made rv6 support the ARM architecture on an ARM machine. The ARM architecture's driver code is located here.
The problem is that when we use QEMU with KVM, the performance is significantly reduced.
The command executed on the ARM machine with KVM is:
qemu-system-aarch64 -machine virt -kernel kernel/kernel -m 128M -smp 80 -nographic -drive file=fs.img,if=none,format=raw,id=x0,copy-on-read=off -device virtio-blk-device,drive=x0,bus=virtio-mmio-bus.0 -cpu host -enable-kvm -machine gic-version=3 -net none
We repeated each of the following and measured the elapsed time:
Write-500-bytes syscall, 10,000 times: KVM disabled: 4,500,000 us; KVM enabled: 29,000,000 us (> 5 times slower).
Open/close syscall, 10,000 times: KVM disabled: 12,000,000 us; KVM enabled: 29,000,000 us (> 2 times slower).
Getppid syscall, 10,000 times: KVM disabled: 735,000 us; KVM enabled: 825,000 us (almost the same).
Simple calculation (a = a * 1664525 + 1013904223), 100 million times: KVM disabled: 2,800,000 us; KVM enabled: 65,000,000 us (> 20 times slower).
The elapsed time was measured with the uptime_as_micro syscall in rv6.
These results were hard to understand, so we first tried to find the bottleneck in rv6's boot process, because finding a bottleneck while running a user program was difficult.
We found that the first noticeable bottleneck in the rv6 boot process was here:
run.as_mut().init();
self.runs().push_front(run.as_ref());
As far as we know, this part just initializes a list and pushes an element onto it. So we thought that, for some reason, KVM is not actually working and that this makes the result worse. Also, this part runs even before any interrupts are turned on, so we thought ARM's GIC and other interrupt-related things are not related to the problem.
So, how can we get better performance when using KVM with QEMU?
To solve this problem, we have already tried the following:
Changing the QEMU version (4.2, 6.2) and the virt machine version, and changing qemu-kvm options such as -cpu, drive cache, copy-on-read, kernel_irqchip, CPU core count, etc.
Looking for a KVM hypercall to use - but none exists on arm64.
Running lmbench in Ubuntu on QEMU with KVM to check that KVM itself is okay - we found that KVM with Ubuntu is much faster than QEMU alone.
Checking whether the 16550A UART print code is so slow with KVM enabled that it produces incorrect benchmark results - without the bottleneck code, rv6's boot time was almost the same with KVM enabled or disabled.
Checking for other people in the same situation - but this superuser page does not help. Our clocksource is arch_sys_counter.
Our Environment
qemu-system-aarch64 version: 4.2.1 (Debian 1:4.2-3ubuntu6.19)
CPU model: Neoverse-N1
Architecture: aarch64
CPU op-mode(s): 32-bit, 64-bit
CPU(s): 80
Ubuntu 20.04.3 LTS (Focal Fossa)
/dev/kvm exists
The main reason is explained here: https://forum.osdev.org/viewtopic.php?f=1&t=56120
rv6's memory attribute setting was incorrect.
Changing ARM Device memory to cacheable Normal memory in the MAIR_EL1 register solved the problem.
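As an illustration only (a minimal sketch, not rv6's actual code; the constant and function names are made up), the idea is to program MAIR_EL1 so that RAM uses a Normal, write-back cacheable attribute instead of a Device attribute:
// ARMv8-A MAIR_EL1 attribute encodings:
//   0x00 = Device-nGnRnE (uncached, strongly ordered)
//   0xff = Normal memory, Inner/Outer Write-Back, Read/Write-Allocate
const MAIR_ATTR_DEVICE_NGNRNE: u64 = 0x00; // attribute index 0, for MMIO
const MAIR_ATTR_NORMAL_WB: u64 = 0xff;     // attribute index 1, for RAM

// Each attribute occupies 8 bits in MAIR_EL1, ordered by index.
const MAIR_EL1_VALUE: u64 = (MAIR_ATTR_NORMAL_WB << 8) | MAIR_ATTR_DEVICE_NGNRNE;

unsafe fn init_mair_el1() {
    // Page-table entries covering RAM must then select attribute index 1,
    // while MMIO mappings keep index 0.
    core::arch::asm!("msr mair_el1, {v}", v = in(reg) MAIR_EL1_VALUE);
}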

Glitch Booting Linux with Spike?

I used Spike to boot Linux using the riscv-tools, but the Linux boot sequence seems to stop at bootconsole [early0] disabled.
I tried adding the kernel command line root=/dev/vda ro console=ttyS0, but it didn't work. The same console settings work in QEMU. I also checked the .config file for the line CONFIG_HVC_RISCV_SBI=y; it was there. I still couldn't get past it.
I tried Linux kernel versions 4.19 to 5.2. No luck. Am I doing something wrong here?
Steps I followed:
Compiled Linux with the RISC-V toolchain
Compiled riscv-pk with ../configure --host=riscv64-unknown-elf --with-payload=[path to vmlinux]
Used "spike bbl" to start the Spike image.
Please let me know if any more info is required.
Sorry, noob here.
Attaching the terminal output:
bbl loader
OF: fdt: Ignoring memory range 0x80000000 - 0x80200000
Linux version 4.19.59 (root@AsusFX504) (gcc version 8.2.0 (GCC)) #2 SMP Sat Jul 20 05:11:32 IST 2019
bootconsole [early0] enabled
initrd not found or empty - disabling initrd
Zone ranges:
DMA32 [mem 0x0000000080200000-0x00000000ffffffff]
Normal empty
Movable zone start for each node
Early memory node ranges
node 0: [mem 0x0000000080200000-0x00000000ffffffff]
Initmem setup node 0 [mem 0x0000000080200000-0x00000000ffffffff]
software IO TLB: mapped [mem 0xfa3fe000-0xfe3fe000] (64MB)
elf_hwcap is 0x112d
percpu: Embedded 17 pages/cpu s29912 r8192 d31528 u69632
Built 1 zonelists, mobility grouping on. Total pages: 516615
Kernel command line:
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Sorting __ex_table...
Memory: 1988760K/2095104K available (5468K kernel code, 329K rwdata, 1751K rodata, 193K init, 806K bss, 106344K reserved, 0K cma-reserved)
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
rcu: Hierarchical RCU implementation.
rcu: RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=1.
rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
NR_IRQS: 0, nr_irqs: 0, preallocated irqs: 0
clocksource: riscv_clocksource: mask: 0xffffffffffffffff max_cycles: 0x24e6a1710, max_idle_ns: 440795202120 ns
Console: colour dummy device 80x25
console [tty0] enabled
bootconsole [early0] disabled
This might be because the Virtual Terminal is enabled in your Linux config. Disabling the Virtual Terminal might solve your issue.
In Linux make menuconfig, go to:
Location:
-> Device Drivers
-> Character devices
and disable Virtual terminal.
Symbol: VT [=y] n
Type : bool
Prompt: Virtual terminal
As mentioned previously, the reason is that VT is enabled, so the kernel has a dummy VT framebuffer that simply goes nowhere, but you don't have to disable it. console=ttyS0 will also not work, since Spike doesn't emulate it. Note that this won't work on the HiFive Unleashed either, since the serial terminals there are ttySIF0 and ttySIF1. What you want is console=hvc0, and Spike will be able to continue from there, e.g. in your kernel .config:
CONFIG_CMDLINE_BOOL=y
CONFIG_CMDLINE="earlyprintk console=hvc0"
CONFIG_CMDLINE_FORCE=y
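After changing those options, the kernel has to be rebuilt and re-embedded in bbl before running Spike again; roughly (the toolchain prefix and paths are placeholders matching the build steps above):
make ARCH=riscv CROSS_COMPILE=[riscv toolchain prefix] vmlinux
cd riscv-pk/build
../configure --host=riscv64-unknown-elf --with-payload=[path to vmlinux]
make
spike bbl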

Error running qemu-system-riscv using root.bin and vmlinux

I am following the riscv.org guides for toolchain building. When emulating with QEMU using a locally built root filesystem (with BusyBox) and Linux kernel, I encounter the error below.
Running QEMU using the locally built root.bin and kernel image:
danny@danny:~/test/riscv/work$ qemu-system-riscv -hda root-local.bin -kernel vmlinux-local -nographic
unassigned address was called?
with addr: 102000735F80006E
not implemented for riscv
Running QEMU using the stock riscv.org root.bin and kernel image:
danny@danny:~/test/riscv/work$ qemu-system-riscv -hda root.bin -kernel vmlinux -nographic
[ 0.150000] io scheduler cfq registered (default)
[ 0.160000] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[ 0.160000] serial8250: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
[ 0.160000] TCP: cubic registered
[ 0.160000] htifbd: detected disk with ID 1
[ 0.160000] htifbd: adding htifbd0
[ 0.160000] VFS: Mounted root (ext2 filesystem) readonly on device 254:0.
[ 0.160000] devtmpfs: mounted
[ 0.160000] Freeing unused kernel memory: 64K (ffffffff80002000 - ffffffff80012000)
[ 0.200000] EXT2-fs (htifbd0): warning: mounting unchecked fs, running e2fsck is recommended
#uname -a
Linux ucbvax 3.14.15-g4073e84-dirty #4 Sun Jan 11 07:17:06 PST 2015 riscv GNU/Linux
When testing QEMU with the root.bin and vmlinux downloaded from riscv.org, everything seems OK, but I can't see the BusyBox startup messages and the terminal can't halt.
I have tested QEMU with various combinations, with the results below:
root.bin        vmlinux         RESULT
local-built     local-built     Unassigned address was called ...
Downloaded      Downloaded      Seems OK, but without the BusyBox startup messages
local-built     Downloaded      Kernel panic - not syncing: No working init found
Downloaded      local-built     Unassigned address was called ...
We are starting a project to build and fabricate a RISC-V silicon chip for makers around the world, and we are testing the toolchain now in order to port Ubuntu Core and Android to RISC-V. Any idea what might have gone wrong?
Thanks.
QEMU hasn't been fully updated to support the new RISC-V privileged spec (github issue). The update is currently underway.
For an ISA simulator, Spike is a good alternative. It may not have all of the platform features of QEMU, but it can serve as a starting point while the QEMU update completes.
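If you go that route, the usual flow is to embed vmlinux in bbl via riscv-pk and run it under Spike, roughly (paths are placeholders):
cd riscv-pk/build
../configure --host=riscv64-unknown-elf --with-payload=[path to vmlinux]
make
spike bbl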

Is there an OS command I can run to determine if running inside a Xen-based virtual machine?

Is there an OS command I can run from within a Xen-based virtual machine that tells me it is a virtual machine rather than a physical box? I heard that the kernel has some self-awareness smarts about it, e.g. an extra column in "ps" output or something. [I know vmstat provides the "st" column, but I have seen this on physical host boxes running Linux kernel 2.6.11 and greater as well.]
Many Thanks,
Paul
Try the file /sys/hypervisor/uuid:
If it does not exist -> not related to Xen.
If it exists and is full of 0s -> it is a Xen Dom0.
If it exists and has non-0 values -> it is a DomU.
This requires, of course, that /sys is mounted and populated...
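A quick way to script that check (a sketch following the rules above; the all-zero UUID is how Dom0 reports itself):
if [ ! -e /sys/hypervisor/uuid ]; then
    echo "not Xen"
elif [ "$(cat /sys/hypervisor/uuid)" = "00000000-0000-0000-0000-000000000000" ]; then
    echo "Xen Dom0"
else
    echo "Xen DomU"
fi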
Dmesg may give some hints from the kernel message buffer; here is the output on a virtualized Ubuntu instance from Slicehost:
bvm@qdbp:~$ sudo dmesg | grep Xen
[ 0.000000] Xen: 0000000000000000 - 00000000000a0000 (usable)
[ 0.000000] Xen: 00000000000a0000 - 0000000000100000 (reserved)
[ 0.000000] Xen: 0000000000100000 - 0000000010000000 (usable)
[ 0.000000] Booting paravirtualized kernel on Xen
[ 0.000000] Xen version: 3.1.2-rc1
[ 0.000000] Xen: using vcpu_info placement
[ 0.000000] Xen: using vcpuop timer interface
[ 0.000000] installing Xen timer for CPU 0
[ 0.021223] installing Xen timer for CPU 1
[ 0.046157] installing Xen timer for CPU 2
[ 0.046157] installing Xen timer for CPU 3
[ 0.265880] Initialising Xen virtual ethernet driver.
