virt-manager install aborting due to memory - linux

Running Linux 5.16.10-arch1-1 #1 SMP PREEMPT Wed, 16 Feb 2022 19:35:18
I've been trying to set up a Windows 10 virtual machine through virt-manager using QEMU and KVM, and I've run into a fair bit of trouble in the process. This error message appears when I click "Begin Installation":
internal error: qemu unexpectedly closed the monitor: 2022-02-21T14:05:58.222051Z qemu-system-x86_64: -device vfio-pci,host=0000:01:00.0,id=hostdev0,bus=pci.4,addr=0x0: VFIO_MAP_DMA failed: Cannot allocate memory
2022-02-21T14:05:58.222249Z qemu-system-x86_64: -device vfio-pci,host=0000:01:00.0,id=hostdev0,bus=pci.4,addr=0x0: VFIO_MAP_DMA failed: Cannot allocate memory
2022-02-21T14:05:58.222327Z qemu-system-x86_64: -device vfio-pci,host=0000:01:00.0,id=hostdev0,bus=pci.4,addr=0x0: vfio 0000:01:00.0: failed to setup container for group 1: memory listener initialization failed: Region pc.ram: vfio_dma_map(0x55ce3e4e7d80, 0x0, 0x80000000, 0x7f46abe00000) = -12 (Cannot allocate memory)
I'm not particularly experienced with Linux or its derivatives (I'm starting with EndeavourOS, which I freshly installed yesterday) but I am somewhat tech savvy, so I have made several attempts to fix this. However, none of the similar issues and solutions I found online after running out of ideas helped.
That includes reading, re-reading and testing what the Arch Linux wiki itself suggests. I've ensured that IOMMU is properly enabled and configured. My GPU (GTX 1080) is completely isolated and bound to vfio-pci, libvirt has been configured, access permission for /dev/vfio/1 has been granted as needed, and I've edited /etc/pam.d/sudo and changed permissions for /etc/libvirt/qemu.conf. Unless I am hilariously misinformed, none of that is the cause of this issue.
I've been largely unsuccessful in finding and fixing the cause of this error message. I have changed the hard and soft RAM limits and tried many different values for memory allocation (2048, 4096, 8192, 12288), and none has changed the error message.
I don't know what logs or outputs would be beneficial to attach, so please feel free to request any that you would need and I will provide them to the best of my ability.
Edit 1
Realized it would probably be a good idea to post the entire error log. Here it is:
Unable to complete install: 'internal error: qemu unexpectedly closed the monitor: 2022-02-21T19:36:08.149009Z qemu-system-x86_64: -device vfio-pci,host=0000:01:00.0,id=hostdev0,bus=pci.4,addr=0x0: VFIO_MAP_DMA failed: Cannot allocate memory
2022-02-21T19:36:08.150079Z qemu-system-x86_64: -device vfio-pci,host=0000:01:00.0,id=hostdev0,bus=pci.4,addr=0x0: VFIO_MAP_DMA failed: Cannot allocate memory
2022-02-21T19:36:08.150300Z qemu-system-x86_64: -device vfio-pci,host=0000:01:00.0,id=hostdev0,bus=pci.4,addr=0x0: vfio 0000:01:00.0: failed to setup container for group 1: memory listener initialization failed: Region pc.ram: vfio_dma_map(0x55d714ea5a30, 0x0, 0x80000000, 0x7fa233e00000) = -12 (Cannot allocate memory)'
Traceback (most recent call last):
  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 65, in cb_wrapper
    callback(asyncjob, *args, **kwargs)
  File "/usr/share/virt-manager/virtManager/createvm.py", line 2001, in _do_async_install
    installer.start_install(guest, meter=meter)
  File "/usr/share/virt-manager/virtinst/install/installer.py", line 701, in start_install
    domain = self._create_guest(
  File "/usr/share/virt-manager/virtinst/install/installer.py", line 649, in _create_guest
    domain = self.conn.createXML(install_xml or final_xml, 0)
  File "/usr/lib/python3.10/site-packages/libvirt.py", line 4400, in createXML
    raise libvirtError('virDomainCreateXML() failed')
libvirt.libvirtError: internal error: qemu unexpectedly closed the monitor: 2022-02-21T19:36:08.149009Z qemu-system-x86_64: -device vfio-pci,host=0000:01:00.0,id=hostdev0,bus=pci.4,addr=0x0: VFIO_MAP_DMA failed: Cannot allocate memory
2022-02-21T19:36:08.150079Z qemu-system-x86_64: -device vfio-pci,host=0000:01:00.0,id=hostdev0,bus=pci.4,addr=0x0: VFIO_MAP_DMA failed: Cannot allocate memory
2022-02-21T19:36:08.150300Z qemu-system-x86_64: -device vfio-pci,host=0000:01:00.0,id=hostdev0,bus=pci.4,addr=0x0: vfio 0000:01:00.0: failed to setup container for group 1: memory listener initialization failed: Region pc.ram: vfio_dma_map(0x55d714ea5a30, 0x0, 0x80000000, 0x7fa233e00000) = -12 (Cannot allocate memory)
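Not a fix, just a way to read the last log line. Below is a minimal Python sketch, assuming the arguments printed for vfio_dma_map are (container, iova, size, vaddr) in that order; the helper name parse_vfio_dma_map is made up for illustration. It extracts the size of the mapping that failed and the errno, then prints the locked-memory limit of the process running the script for comparison. Whether the memlock limit is really the cause here is an assumption; the log only says the mapping failed with errno 12.

```python
import errno
import re
import resource

# Last line of the error log, shortened to the part that matters.
LOG_LINE = ("Region pc.ram: vfio_dma_map(0x55d714ea5a30, 0x0, 0x80000000, "
            "0x7fa233e00000) = -12 (Cannot allocate memory)")

# Assumed argument order: container pointer, iova, size, vaddr.
PATTERN = re.compile(
    r"vfio_dma_map\((0x[0-9a-f]+), (0x[0-9a-f]+), (0x[0-9a-f]+), (0x[0-9a-f]+)\)"
    r" = -(\d+)"
)


def parse_vfio_dma_map(line):
    """Return (size_in_bytes, errno_name) from a vfio_dma_map failure line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a vfio_dma_map failure line")
    size = int(match.group(3), 16)
    err = int(match.group(5))
    return size, errno.errorcode.get(err, str(err))


size, err_name = parse_vfio_dma_map(LOG_LINE)
print(f"mapping size: {size} bytes = {size / 2**30:g} GiB, errno: {err_name}")

# This is the limit of the current process only. QEMU started by libvirt
# may run with a different limit, so treat the comparison as a hint.
soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
if soft == resource.RLIM_INFINITY:
    print("memlock soft limit: unlimited")
else:
    print(f"memlock soft limit: {soft} bytes, "
          f"large enough for this mapping: {soft >= size}")
```

The size field decodes to 2 GiB, which is the first chunk of guest RAM (pc.ram) that QEMU tries to pin for the passed-through device.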

Related

Will setting the Linux locked-memory limit to unlimited have an adverse effect?

I am running an MPI job on a Linux server. I got this error:
--------------------------------------------------------------------------
The OpenFabrics (openib) BTL failed to initialize while trying to
allocate some locked memory. This typically can indicate that the
memlock limits are set too low. For most HPC installations, the
memlock limits should be set to "unlimited". The failure occured
here:
Local host: yw0431
OMPI source: ../../../../../ompi/mca/btl/openib/btl_openib_component.c:1216
Function: ompi_free_list_init_ex_new()
Device: mlx4_0
Memlock limit: 65536
You may need to consult with your system administrator to get this
problem fixed. This FAQ entry on the Open MPI web site may also be
helpful:
http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.
Local host: yw0431
Local device: mlx4_0
--------------------------------------------------------------------------
[yw0431:20193] 11 more processes have sent help message help-mpi-btl-openib.txt / init-fail-no-mem
[yw0431:20193] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
[yw0431:20193] 11 more processes have sent help message help-mpi-btl-openib.txt / error in device init
forrtl: error (78): process killed (SIGTERM)
It means that my Linux server has a locked-memory limit of 65M, but my job needs more memory. I think 2G should be enough.
I have found a solution that removes the limit on locked memory:
ulimit -l unlimited
But I am worried that this will cause a system crash or some other bad thing to happen.
So can I set "ulimit -l unlimited"?
If you set the limit to unlimited and your process starts using memory exhaustively, the OOM killer will kill your job to keep the system stable. I would set the limit to 80 to 90% of RAM instead of unlimited.
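To turn "80 to 90% of RAM" into a number you can pass to ulimit -l, here is a rough Python sketch. It assumes a Linux host where /proc/meminfo reports MemTotal in kB and that ulimit -l takes its value in units of 1024 bytes, as bash does; the function names and the 16 GiB example value are made up for illustration.

```python
def memlock_limit_kib(mem_total_kib, percent=80):
    """Return `percent` of total RAM, in KiB, using integer arithmetic."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range 1..100")
    return mem_total_kib * percent // 100


def read_mem_total_kib(path="/proc/meminfo"):
    """Read MemTotal (reported in kB) from /proc/meminfo on Linux."""
    with open(path) as meminfo:
        for line in meminfo:
            if line.startswith("MemTotal:"):
                return int(line.split()[1])
    raise RuntimeError("MemTotal not found in " + path)


# Hypothetical machine with 16 GiB of RAM, used when /proc/meminfo
# is not available (for example on a non-Linux system).
EXAMPLE_TOTAL_KIB = 16 * 1024 * 1024

try:
    total_kib = read_mem_total_kib()
except (OSError, RuntimeError):
    total_kib = EXAMPLE_TOTAL_KIB

for percent in (80, 90):
    print(f"ulimit -l {memlock_limit_kib(total_kib, percent)}  # {percent}% of RAM")
```

The printed lines are only suggestions for a shell session; a permanent limit is normally set by the administrator, for example in /etc/security/limits.conf.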

blktrace output error in docker container

My Docker container's base image is Ubuntu, and I run it with full privileges, meaning that I pass these switches to the run command:
--cap-add=SYS_ADMIN --security-opt apparmor:unconfined
I want to use blktrace with the command below:
sudo blktrace -d / -a issue -o - | blkparse -f "%p %T.%9t %D %S ^C %d\n" -i - >stream.out
But the first time I use this command I get this error:
Debugfs is not mounted at /sys/kernel/debug
I searched and found a solution that led me to use this command:
mount -t debugfs none /sys/kernel/debug
After that, when I use the blktrace command again, I get this error:
BLKTRACESETUP(2) / failed: 25/Inappropriate ioctl for device
Thread 1 failed open /sys/kernel/debug/block/(null)/trace1: 2/No such file or directory
Thread 0 failed open /sys/kernel/debug/block/(null)/trace0: 2/No such file or directory
Thread 2 failed open /sys/kernel/debug/block/(null)/trace2: 2/No such file or directory
Thread 4 failed open /sys/kernel/debug/block/(null)/trace4: 2/No such file or directory
Thread 3 failed open /sys/kernel/debug/block/(null)/trace3: 2/No such file or directory
Thread 5 failed open /sys/kernel/debug/block/(null)/trace5: 2/No such file or directory
Thread 7 failed open /sys/kernel/debug/block/(null)/trace7: 2/No such file or directory
Thread 8 failed open /sys/kernel/debug/block/(null)/trace8: 2/No such file or directory
Thread 6 failed open /sys/kernel/debug/block/(null)/trace6: 2/No such file or directory
Thread 12 failed open /sys/kernel/debug/block/(null)/trace12: 2/No such file or directory
Thread 10 failed open /sys/kernel/debug/block/(null)/trace10: 2/No such file or directory
Thread 13 failed open /sys/kernel/debug/block/(null)/trace13: 2/No such file or directory
Thread 15 failed open /sys/kernel/debug/block/(null)/trace15: 2/No such file or directory
Thread 14 failed open /sys/kernel/debug/block/(null)/trace14: 2/No such file or directory
Thread 17 failed open /sys/kernel/debug/block/(null)/trace17: 2/No such file or directory
Thread 16 failed open /sys/kernel/debug/block/(null)/trace16: 2/No such file or directory
Thread 18 failed open /sys/kernel/debug/block/(null)/trace18: 2/No such file or directory
Thread 11 failed open /sys/kernel/debug/block/(null)/trace11: 2/No such file or directory
Thread 19 failed open /sys/kernel/debug/block/(null)/trace19: 2/No such file or directory
Thread 20 failed open /sys/kernel/debug/block/(null)/trace20: 2/No such file or directory
Thread 9 failed open /sys/kernel/debug/block/(null)/trace9: 2/No such file or directory
Thread 21 failed open /sys/kernel/debug/block/(null)/trace21: 2/No such file or directory
Thread 22 failed open /sys/kernel/debug/block/(null)/trace22: 2/No such file or directory
Thread 23 failed open /sys/kernel/debug/block/(null)/trace23: 2/No such file or directory
FAILED to start thread on CPU 0: 1/Operation not permitted
FAILED to start thread on CPU 1: 1/Operation not permitted
FAILED to start thread on CPU 2: 1/Operation not permitted
FAILED to start thread on CPU 3: 1/Operation not permitted
FAILED to start thread on CPU 4: 1/Operation not permitted
FAILED to start thread on CPU 5: 1/Operation not permitted
FAILED to start thread on CPU 6: 1/Operation not permitted
FAILED to start thread on CPU 7: 1/Operation not permitted
FAILED to start thread on CPU 8: 1/Operation not permitted
FAILED to start thread on CPU 9: 1/Operation not permitted
FAILED to start thread on CPU 10: 1/Operation not permitted
FAILED to start thread on CPU 11: 1/Operation not permitted
FAILED to start thread on CPU 12: 1/Operation not permitted
FAILED to start thread on CPU 13: 1/Operation not permitted
FAILED to start thread on CPU 14: 1/Operation not permitted
FAILED to start thread on CPU 15: 1/Operation not permitted
FAILED to start thread on CPU 16: 1/Operation not permitted
FAILED to start thread on CPU 17: 1/Operation not permitted
FAILED to start thread on CPU 18: 1/Operation not permitted
FAILED to start thread on CPU 19: 1/Operation not permitted
FAILED to start thread on CPU 20: 1/Operation not permitted
FAILED to start thread on CPU 21: 1/Operation not permitted
FAILED to start thread on CPU 22: 1/Operation not permitted
FAILED to start thread on CPU 23: 1/Operation not permitted
How can I solve this?
Update1:
There is an sda folder in /sys/kernel/debug/block/, and it contains these files:
trace0 trace1 trace2 etc.
Update2:
@abligh thanks for your answer, but it did not help. The strace output about ioctl is:
ioctl(3, BLKTRACESETUP, {act_mask=64, buf_size=524288, buf_nr=4, start_lba=0, end_lba=0, pid=0}, 0x7ffe8a4ceac0) = -1 ENOTTY (Inappropriate ioctl for device)
write(2, "BLKTRACESETUP(2) / failed: 25/Inappropriate ioctl for device\n", 61) = 61
ioctl(3, BLKTRACESTOP, 0x7f6fd19789d0) = -1 ENOTTY (Inappropriate ioctl for device)
ioctl(3, BLKTRACESTOP, 0x7ffe8a4ce540) = -1 ENOTTY (Inappropriate ioctl for device)
ioctl(3, BLKTRACESTOP, 0x7f6fd19789e0) = -1 ENOTTY (Inappropriate ioctl for device)
ioctl(3, BLKTRACETEARDOWN, 0x7f6fd19789e0) = -1 ENOTTY (Inappropriate ioctl for device)
To answer your question about why I run blktrace in a container: I'm using containers as a cluster, so I need the trace of each node.
I don't have enough information to debug this, but you asked for an answer to be posted.
The root of the problem seems to be here:
BLKTRACESETUP(2) / failed: 25/Inappropriate ioctl for device
This is indicating that the ioctl call to set up block tracing is failing. It is failing inside the container, but according to the comments works outside the container. This would indicate that the problem is with the container setup or a limitation in the kernel that prevents that ioctl from being used in a container at all.
The error inappropriate ioctl for device is errno 25, i.e. ENOTTY. I can't immediately see what would be returning that unless it can't find the device node at all (given you've already demonstrated from the fact it works outside the container that the block trace code is compiled in). I can't remember whether this is in a module, but it would be worth trying block tracing outside the container first (then check it inside the container), just to check this isn't a module loading issue.
The first step in debugging this would be to use the strace tool (as suggested above) so you know exactly which system call is being made with what parameters. E.g. run:
sudo strace -f -s2048 -o/tmp/trace blktrace -d / -a issue -o - | blkparse -f "%p %T.%9t %D %S ^C %d\n" -i - >stream.out
and look at /tmp/trace afterwards. strace will list all the system calls made. See if you can determine from that which ioctl is failing.
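If /tmp/trace turns out to be long, a small filter helps. This is a rough Python sketch that keeps only the ioctl calls that returned -1; the regular expression assumes strace's usual "call(args) = -1 ERRNO (message)" line layout, and the sample line and the function name failed_ioctls are only for illustration.

```python
import re

# Matches e.g.: ioctl(3, BLKTRACESETUP, {...}, 0x7ffe...) = -1 ENOTTY (...)
# Assumes strace's usual layout for a failed call.
FAILED_IOCTL = re.compile(r"ioctl\((\d+), (\w+).*\) = -1 (\w+) \((.*)\)")


def failed_ioctls(lines):
    """Yield (fd, request, errno_name) for every ioctl that returned -1."""
    for line in lines:
        match = FAILED_IOCTL.search(line)
        if match:
            yield int(match.group(1)), match.group(2), match.group(3)


# Sample lines in the format strace prints; the first is a failing call.
SAMPLE = [
    "ioctl(3, BLKTRACESETUP, {act_mask=64, buf_size=524288, buf_nr=4, "
    "start_lba=0, end_lba=0, pid=0}, 0x7ffe8a4ceac0) = -1 ENOTTY "
    "(Inappropriate ioctl for device)",
    "close(3) = 0",
]

for fd, request, err in failed_ioctls(SAMPLE):
    print(f"fd {fd}: {request} failed with {err}")
```

To run it on the real file, pass open("/tmp/trace") instead of SAMPLE. The pattern uses search rather than match, so the PID prefix that strace -f adds to each line does not get in the way.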
Secondly, I'd ensure that the block device that you are trying to trace actually appears within the correct place(s) in /proc/ and /sys/. Something wrong is happening here:
Thread 0 failed open /sys/kernel/debug/block/(null)/trace0: 2/No such file or directory
Note the (null) in the debug line, which clearly should not be there - that should be the name of the block device. This is possibly a consequence of the failed ioctl, or possibly indicative of a problem within the /sys/ hierarchy.
BLKTRACESETUP is handled in the kernel and eventually calls do_blk_trace_setup. I cannot immediately see any reason why this would not work from an appropriately permissioned container, which makes me suspect your /sys hierarchy might not be set up right. But the output of strace would be helpful.
Also, the inevitable comment: why are you doing this in a container? Why not run it outside the container?
EDIT: Looks like it might be a kernel thing. See https://github.com/scaleway/kernel-tools/issues/107 - this suggests you (a) need to modprobe the specific modules first, and (b) may need a specific kernel.
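One more thing that may be worth ruling out. This is an assumption on my part, not something confirmed in this thread: the command passes / to -d, and / is a directory (a mount point), not a block device node. A block ioctl issued on a file descriptor that is not a block device would typically fail with ENOTTY, which would match the error above. A minimal Python sketch to check what a path really is; /dev/sda is only a guess based on the sda entry under /sys/kernel/debug/block/ and may not exist inside the container.

```python
import errno
import os
import stat


def is_block_device(path):
    """Return True only if `path` exists and is a block device node."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        return False
    return stat.S_ISBLK(mode)


# "/" is what the failing command passed to -d; "/dev/sda" is a guess.
for candidate in ("/", "/dev/sda"):
    print(f"{candidate}: block device = {is_block_device(candidate)}")

# The errno number and message that blktrace reported.
print(errno.ENOTTY, os.strerror(errno.ENOTTY))
```

If the device node is missing inside the container, it would have to be made available to the container first (for example with docker run --device) before blktrace can open it.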

Spark - UbuntuVM - insufficient memory for the Java Runtime Environment

I'm trying to install Spark 1.5.1 on an Ubuntu 14.04 VM. After un-tarring the file, I changed directory to the extracted folder and executed the command "./bin/pyspark", which should fire up the pyspark shell. But I got the following error message:
[ OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000c5550000, 715849728, 0) failed;
error='Cannot allocate memory' (errno=12) There is insufficient
memory for the Java Runtime Environment to continue.
Native memory allocation (malloc) failed to allocate 715849728 bytes
for committing reserved memory.
An error report file with more information is saved as:
/home/datascience/spark-1.5.1-bin-hadoop2.6/hs_err_pid2750.log ]
Could anyone please give me some directions to sort out the problem?
You need to set the Spark memory properties (spark.executor.memory, spark.driver.memory) in the conf/spark-defaults.conf file to values that suit your machine. For example,
usr1@host:~/spark-1.6.1$ cp conf/spark-defaults.conf.template conf/spark-defaults.conf
nano conf/spark-defaults.conf
spark.driver.memory 512m
For more information, refer to the official documentation: http://spark.apache.org/docs/latest/configuration.html
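Pretty much what it says. The JVM tried to commit 715849728 bytes (about 683 MiB, not 7 GB) and the VM could not provide it. So give the VM more RAM, or lower Spark's memory settings.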
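For scale, the byte count in the error message can be converted directly. A tiny Python sketch; the helper names to_mib and to_gib are made up for illustration.

```python
# Byte count taken from the os::commit_memory error message.
REQUESTED_BYTES = 715849728


def to_mib(n_bytes):
    """Convert a byte count to mebibytes (1 MiB = 2**20 bytes)."""
    return n_bytes / 2**20


def to_gib(n_bytes):
    """Convert a byte count to gibibytes (1 GiB = 2**30 bytes)."""
    return n_bytes / 2**30


print(f"{REQUESTED_BYTES} bytes = {to_mib(REQUESTED_BYTES):.1f} MiB "
      f"= {to_gib(REQUESTED_BYTES):.2f} GiB")
# prints: 715849728 bytes = 682.7 MiB = 0.67 GiB
```

So the allocation that failed was well under 1 GiB, which suggests that the VM had very little free memory at that moment.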

Out of memory errors when executing Jest on Ubuntu

I'm trying to execute Jest on Ubuntu 14.04.02, in a virtual machine with 4 GB of RAM. Node version 0.12.2, npm 2.0.0-alpha-5.
free shows me:
      total  used  free  shared  buffers  cached
Mem:   3.8G  199M  3.6G    976K     1.1M     18M
When I run npm test, I keep getting a variety of out of memory errors:
Error: FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory
FATAL ERROR: Committing semi space failed. Allocation failed - process out of memory
# Fatal error in ../deps/v8/src/heap/store-buffer.cc, line 132
# CHECK(old_virtual_memory_->Commit(reinterpret_cast<void*>(old_limit_), grow * kPointerSize, false)) failed
Any idea what the minimum memory requirement is, or whether I have misconfigured something that is leading to this?
It turns out that downgrading to node version 0.10.32, installed via npm, fixed the issue.

Can't compile Wine under Windows

I need to compile Wine DLLs under Windows for debugging purposes.
I installed Cygwin, downloaded Wine and ran "./configure", and then the following error appeared. I am completely new to the Linux environment, so I can't even understand what it means.
./configure: fork: retry: Resource temporarily unavailable
261414951 [main] sh 1368 fhandler_dev_zero::fixup_mmap_after_fork: requested 0x7E6E0000 != 0x0 mem alloc base 0x0, state 0x10000, size 524288, Win32 error 487
261415207 [main] sh 1368 C:\cygwin\bin\sh.exe: *** fatal error in forked process - recreate_mmaps_after_fork_failed
261415593 [main] sh 1368 open_stackdumpfile: Dumping stack trace to sh.exe.stackdump
As per Cygwin wiki, errors related to fork() are usually solved by rebasing.
Stop all Cygwin services; open a command prompt & issue:
\cygwin\bin\dash -c '/usr/bin/rebaseall'
