SLURM, using srun to print outputs - slurm

I am using srun to run my program, but it does not print the output.
me#home:~$ srun -p K80q --gres=gpu:1 -N 1 python3 main.py
2019-05-15 19:56:43.305156: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-05-15 19:56:43.543516: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1392] Found device 0 with properties:
name: Tesla K80 major: 3 minor: 7 memoryClockRate(GHz): 0.8235
pciBusID: 0000:85:00.0
totalMemory: 11.17GiB freeMemory: 11.10GiB
2019-05-15 19:56:43.543567: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1471] Adding visible gpu devices: 0
2019-05-15 19:56:43.900189: I tensorflow/core/common_runtime/gpu/gpu_device.cc:952] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-05-15 19:56:43.900248: I tensorflow/core/common_runtime/gpu/gpu_device.cc:958] 0
2019-05-15 19:56:43.900257: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: N
2019-05-15 19:56:43.900619: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1084] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10761 MB memory) -> physical GPU (device: 0, name: Tesla K80, pci bus id: 0000:85:00.0, compute capability: 3.7)
That is all the output I get; the information I expect is never printed. How can I fix it?
By the way, a simple test script works:
import tensorflow
if __name__ == '__main__':
    for i in range(10):
        print('Hello')
It prints Hello 10 times.
Update:
After 20 minutes, it printed some of the information I expected. How can I make it print immediately?

Try the -u option of srun:
-u, --unbuffered
By default the connection between slurmstepd and the user launched application is over a pipe. The stdio output written by the application is buffered by the glibc until it is flushed or the output is set as unbuffered. See setbuf(3). If this option is specified the tasks are executed with a pseudo terminal so that the application output is unbuffered.
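If changing the srun command line is not convenient, flushing on the Python side may also help. This is only a minimal sketch and not part of the quoted man page: the helper name log is hypothetical, and it assumes the delay comes from Python's own stdout buffering when stdout is a pipe rather than a terminal.

```python
import sys


def log(message, stream=sys.stdout):
    """Write one line and flush right away so it does not sit in a pipe buffer."""
    # flush=True forces the text out of Python's buffer after every call.
    print(message, file=stream, flush=True)


if __name__ == "__main__":
    for step in range(3):
        log(f"step {step}")
```

With the srun option from the answer, the command from the question would become srun -u -p K80q --gres=gpu:1 -N 1 python3 main.py.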

Related

Kernel modules not loaded during boot

I have noticed that some kernel modules are not being loaded with the latest kernel, 5.15.34-v7.
I built core-image-base from meta-raspberrypi (0135a02), and trying to access the camera with Picamera produced some errors. They mainly complain that the mmal drivers are not present.
root#raspberrypi3:~# python3
Python 3.10.4 (main, Mar 23 2022, 20:25:24) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from picamera import PiCamera
>>> camera = PiCamera()
mmal: mmal_vc_shm_init: could not initialize vc shared memory service
mmal: mmal_vc_component_create: failed to initialise shm for 'vc.camera_info' (7:EIO)
mmal: mmal_component_create_core: could not create component 'vc.camera_info' (7)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.10/site-packages/picamera/camera.py", line 408, in __init__
self._init_revision(options)
File "/usr/lib/python3.10/site-packages/picamera/camera.py", line 480, in _init_revision
with mo.MMALCameraInfo() as camera_info:
File "/usr/lib/python3.10/site-packages/picamera/mmalobj.py", line 2425, in __init__
super(MMALCameraInfo, self).__init__()
File "/usr/lib/python3.10/site-packages/picamera/mmalobj.py", line 696, in __init__
mmal_check(
File "/usr/lib/python3.10/site-packages/picamera/exc.py", line 184, in mmal_check
raise PiCameraMMALError(status, prefix)
picamera.exc.PiCameraMMALError: Failed to create MMAL component b'vc.camera_info': I/O error
>>>
>>>
root#raspberrypi3:~#
While digging through my system I found an older build (I don't know why I hadn't deleted it, but thankfully it gave some insight into the issue). I booted that image and everything seems to work fine.
So I checked out the commit that the older build used (63a3d8cb17c5d1affe8f2848f45fcc6a706f9412), and the camera worked fine (though I had to make a few changes that are not significant for this issue). While analyzing the boot logs I found that the latest build (0135a02) does not load all the drivers.
I have also observed that the kernel modules are compressed in the 5.15.34 kernel, e.g. root#raspberrypi3:~# ls /lib/modules/5.15.34-v7/kernel/drivers/usb/gadget/libcomposite.ko.xz, and trying to load a module with modprobe gives the following error:
root#raspberrypi3:~# ls /lib/modules/5.15.34-v7/kernel/drivers/usb/gadget/legacy/
g_acm_ms.ko.xz g_cdc.ko.xz g_hid.ko.xz g_midi.ko.xz g_printer.ko.xz g_webcam.ko.xz gadgetfs.ko.xz
g_audio.ko.xz g_ether.ko.xz g_mass_storage.ko.xz g_multi.ko.xz g_serial.ko.xz g_zero.ko.xz
root#raspberrypi3:~# modprobe gadgetfs
modprobe: FATAL: Module gadgetfs not found in directory /lib/modules/5.15.34-v7
My question is: what changed in the kernel between 63a3d8cb17c5d1affe8f2848f45fcc6a706f9412 (5.10) and 0135a02 (5.15), and where, so that I can look into it and adapt accordingly?
Note: all the commit hashes mentioned above are from the meta-raspberrypi repo.
Logs
lsmod logs
5.15.34
root#raspberrypi3:~# lsmod
Module Size Used by
root#raspberrypi3:~#
5.10.81
root#raspberrypi3:~# lsmod
Module Size Used by
rfcomm 49152 2
cmac 16384 3
algif_hash 16384 1
nfc 86016 0
aes_arm_bs 24576 2
crypto_simd 16384 1 aes_arm_bs
cryptd 24576 2 crypto_simd
algif_skcipher 16384 1
af_alg 28672 6 algif_hash,algif_skcipher
bnep 20480 2
hci_uart 40960 1
btbcm 16384 1 hci_uart
bluetooth 421888 31 hci_uart,bnep,btbcm,rfcomm
ecdh_generic 16384 2 bluetooth
ecc 36864 1 ecdh_generic
ipv6 503808 26
brcmfmac 331776 0
brcmutil 24576 1 brcmfmac
sha256_generic 16384 0
bcm2835_v4l2 49152 0
cfg80211 782336 1 brcmfmac
bcm2835_codec 40960 0
bcm2835_isp 32768 0
v4l2_mem2mem 36864 1 bcm2835_codec
rfkill 32768 4 bluetooth,nfc,cfg80211
bcm2835_mmal_vchiq 36864 3 bcm2835_isp,bcm2835_codec,bcm2835_v4l2
videobuf2_dma_contig 20480 2 bcm2835_isp,bcm2835_codec
videobuf2_vmalloc 16384 1 bcm2835_v4l2
videobuf2_memops 16384 2 videobuf2_dma_contig,videobuf2_vmalloc
videobuf2_v4l2 32768 4 bcm2835_isp,bcm2835_codec,bcm2835_v4l2,v4l2_mem2mem
videobuf2_common 61440 5 bcm2835_isp,bcm2835_codec,bcm2835_v4l2,v4l2_mem2mem,videobuf2_v4l2
raspberrypi_hwmon 16384 0
videodev 253952 6 bcm2835_isp,bcm2835_codec,videobuf2_common,bcm2835_v4l2,v4l2_mem2mem,videobuf2_v4l2
mc 45056 6 bcm2835_isp,bcm2835_codec,videobuf2_common,videodev,v4l2_mem2mem,videobuf2_v4l2
vc_sm_cma 32768 2 bcm2835_isp,bcm2835_mmal_vchiq
uio_pdrv_genirq 16384 0
uio 20480 1 uio_pdrv_genirq
fixed 16384 0
root#raspberrypi3:~#
Make sure you have kernel-modules installed (on Yocto releases that use the newer override syntax, write IMAGE_INSTALL:append instead):
IMAGE_INSTALL_append = " kernel-modules"
EDIT
The package that provides all kernel modules is kernel-modules; alternatively, each module is available in a separate package, kernel-module-<module_name>. meta-raspberrypi lists kernel-modules as a package that is not essential for boot, which means that if the package is not found, the board should still boot normally:
meta-raspberrypi/conf/machine/include/rpi-base.inc
MACHINE_EXTRA_RRECOMMENDS += "kernel-modules udev-rules-rpi"
In previous meta-raspberrypi branches, there was an rpi image recipe, rpi-basic-image.bb:
# Base this image on core-image-minimal
include recipes-core/images/core-image-minimal.bb
# Include modules in rootfs
IMAGE_INSTALL += " \
kernel-modules \
"
SPLASH = "psplash-raspberrypi"
IMAGE_FEATURES += "ssh-server-dropbear splash"
do_image:prepend() {
bb.warn("The image 'rpi-basic-image' is deprecated, please use 'core-image-base' instead")
}
So the only thing needed to integrate the kernel modules is the kernel-modules package, either via an image like the example above, or try:
local.conf
MACHINE_EXTRA_RRECOMMENDS_remove = "kernel-modules"
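To tell apart "the module files are not in the rootfs" from "the files are there but modprobe cannot find them", a small check on the target can help. The sketch below is an assumption-laden diagnostic and not part of the answer: the function names and the list of suffixes are hypothetical, and it assumes that modprobe resolves names through the index written by depmod (modules.dep and its binary form), so a module file that is missing from that index would not be found even though it is on disk.

```python
import os

# Plain and compressed module file suffixes (the list itself is an assumption).
MODULE_SUFFIXES = (".ko", ".ko.xz", ".ko.gz", ".ko.zst")


def module_stem(filename):
    """Return the normalized module name for a module file, or None for other files."""
    for suffix in MODULE_SUFFIXES:
        if filename.endswith(suffix):
            # '-' and '_' are treated as interchangeable in module names.
            return filename[: -len(suffix)].replace("-", "_")
    return None


def find_module_files(modules_root, module_name):
    """List module files under modules_root whose name matches module_name."""
    wanted = module_name.replace("-", "_")
    matches = []
    for dirpath, _dirnames, filenames in os.walk(modules_root):
        for filename in filenames:
            if module_stem(filename) == wanted:
                matches.append(os.path.join(dirpath, filename))
    return sorted(matches)


def listed_in_modules_dep(modules_root, module_name):
    """Report whether modules.dep has an entry (left of the colon) for module_name."""
    dep_path = os.path.join(modules_root, "modules.dep")
    if not os.path.exists(dep_path):
        return False
    wanted = module_name.replace("-", "_")
    with open(dep_path) as dep_file:
        for line in dep_file:
            entry = os.path.basename(line.split(":", 1)[0].strip())
            if module_stem(entry) == wanted:
                return True
    return False
```

For the case in the question, one would call find_module_files("/lib/modules/5.15.34-v7", "gadgetfs") and listed_in_modules_dep("/lib/modules/5.15.34-v7", "gadgetfs"). If the first returns a path and the second returns False, the index rather than the package list would be the place to look.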

Cannot dlopen some GPU libraries. Skipping registering GPU devices

TensorFlow is only using the CPU and won't use the GPU. I assume it's because it expects CUDA 10.0 and it finds 10.2.
I had installed 10.2 but have purged it and installed 10.0.
I'm running Ubuntu 19.10 with an AMD Ryzen 2700 CPU and an RTX 2080 S.
I have installed the 440 NVIDIA driver. It reports CUDA version 10.2 when I check with nvidia-smi and nvcc --version.
From pip3: tensorflow-gpu 1.14.0
tensorflow-datasets 2.0.0
tensorflow-estimator 1.14.0
tensorflow-metadata 0.21.1
From Nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.44 Driver Version: 440.44 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce RTX 208... Off | 00000000:08:00.0 On | N/A |
| 0% 48C P8 13W / 250W | 369MiB / 7979MiB | 3% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1110 G /usr/lib/xorg/Xorg 18MiB |
| 0 1611 G /usr/lib/xorg/Xorg 73MiB |
| 0 1816 G /usr/bin/gnome-shell 108MiB |
| 0 2655 C python3 115MiB |
+-----------------------------------------------------------------------------+
from nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Oct_23_19:24:38_PDT_2019
Cuda compilation tools, release 10.2, V10.2.89
But when I check version.txt I get 10.0.130:
cat /usr/local/cuda/version.txt
CUDA Version 10.0.130
I check the devices with:
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
result:
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 4810338588393992961
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 7271419476897292826
physical_device_desc: "device: XLA_CPU device"
, name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 4332706623198547606
physical_device_desc: "device: XLA_GPU device"
]
How do I register 10.0.130?
Is that the reason it won't run on the GPU? It's super slow on the 8-core CPU. Any advice?
Here is the log:
2020-02-13 14:11:31.411277: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-02-13 14:11:31.440150: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3193485000 Hz
2020-02-13 14:11:31.441076: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5625b689c790 executing computations on platform Host. Devices:
2020-02-13 14:11:31.441123: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>
2020-02-13 14:11:31.443001: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-02-13 14:11:31.472935: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-13 14:11:31.473407: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce RTX 2080 SUPER major: 7 minor: 5 memoryClockRate(GHz): 1.845
pciBusID: 0000:08:00.0
2020-02-13 14:11:31.474361: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-02-13 14:11:31.487124: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-02-13 14:11:31.496148: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2020-02-13 14:11:31.498873: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2020-02-13 14:11:31.514842: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2020-02-13 14:11:31.525992: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2020-02-13 14:11:31.526168: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.2/lib64
2020-02-13 14:11:31.526183: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1663] Cannot dlopen some GPU libraries. Skipping registering GPU devices...
2020-02-13 14:11:31.618627: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-02-13 14:11:31.618655: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2020-02-13 14:11:31.618662: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2020-02-13 14:11:31.620367: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-13 14:11:31.621395: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5625b732d5f0 executing computations on platform CUDA. Devices:
2020-02-13 14:11:31.621407: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): GeForce RTX 2080 SUPER, Compute Capability 7.5
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 13330791690361361129
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 11872341970779952422
physical_device_desc: "device: XLA_CPU device"
, name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 15007819717683015571
physical_device_desc: "device: XLA_GPU device"
]
WARNING:tensorflow:From pokeGAN.py:172: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.
WARNING:tensorflow:From pokeGAN.py:174: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From pokeGAN.py:77: The name tf.get_variable is deprecated. Please use tf.compat.v1.get_variable instead.
2020-02-13 14:11:33.799163: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-13 14:11:33.799597: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce RTX 2080 SUPER major: 7 minor: 5 memoryClockRate(GHz): 1.845
pciBusID: 0000:08:00.0
2020-02-13 14:11:33.799646: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-02-13 14:11:33.799658: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2020-02-13 14:11:33.799669: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2020-02-13 14:11:33.799684: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2020-02-13 14:11:33.799695: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2020-02-13 14:11:33.799706: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2020-02-13 14:11:33.799777: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.2/lib64
2020-02-13 14:11:33.799786: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1663] Cannot dlopen some GPU libraries. Skipping registering GPU devices...
2020-02-13 14:11:33.800016: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-02-13 14:11:33.800028: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]
WARNING:tensorflow:From pokeGAN.py:203: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.
2020-02-13 14:11:34.197990: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
WARNING:tensorflow:From /home/node/.local/lib/python3.7/site-packages/tensorflow/python/training/saver.py:1276: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
WARNING:tensorflow:From pokeGAN.py:211: start_queue_runners (from tensorflow.python.training.queue_runner_impl) is deprecated and will be removed in a future version.
Instructions for updating:
To construct input pipelines, use the `tf.data` module.
total training sample num:91
batch size: 64, batch num per epoch: 1, epoch num: 5000
start training...
Judging from your logs, it looks like TensorFlow finds the correct CUDA version, but the cuDNN library is missing:
2020-02-13 14:11:31.474361: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2020-02-13 14:11:31.526168: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Could not dlopen library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda-10.2/lib64
Have you installed the correct version of cuDNN? As you can see here, TensorFlow 1.14 also requires cuDNN 7.4.
The only thing that worked for me to solve this issue was to completely remove CUDA and reinstall it.
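Before reinstalling, it may be worth confirming what the dynamic loader can actually see. The sketch below is only an illustration with hypothetical helper names: it tries to load a shared library by name through ctypes, and it lists which directories of a search path contain the file. The log above shows LD_LIBRARY_PATH set to /usr/local/cuda-10.2/lib64, so a libcudnn.so.7 installed elsewhere would presumably not be picked up from that variable.

```python
import ctypes
import os


def can_dlopen(library_name):
    """Return True if the dynamic loader can open library_name, else False."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


def dirs_containing(library_name, search_path):
    """Return the directories in a colon-separated search path that hold library_name."""
    return [
        directory
        for directory in search_path.split(os.pathsep)
        if directory and os.path.isfile(os.path.join(directory, library_name))
    ]


if __name__ == "__main__":
    name = "libcudnn.so.7"
    print("loadable:", can_dlopen(name))
    print("found in:", dirs_containing(name, os.environ.get("LD_LIBRARY_PATH", "")))
```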

intel SPDK ioat example fail to run

I am new to Intel SPDK and ran into a problem when running the example code.
I set up the BIOS as this page says:
Intel® Hyper-Threading Technology off
Intel SpeedStep® technology enabled
Intel® Turbo Boost Technology disabled
Then I cloned the repository from this page and ran all the commands. The test command ./test/unit/unittest.sh returns All unit tests passed.
But when I run the example examples/ioat/verify/verify, it returns:
EAL: 24 hugepages of size 1073741824 reserved, but no mounted hugetlbfs found for that size
Starting SPDK v18.10-pre / DPDK 18.05.0 initialization...
[ DPDK EAL parameters: verify --no-shconf -c 0x1 --legacy-mem --file-prefix=spdk_pid3170 ]
EAL: Detected 16 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/spdk_pid3170/mp_socket
EAL: 24 hugepages of size 1073741824 reserved, but no mounted hugetlbfs found for that size
EAL: Probing VFIO support...
User configuration:
Run time: 10 seconds
Core mask: 0x1
Queue depth: 32
Not enough ioat channels found. Check that ioat channels are bound
to uio_pci_generic or vfio-pci. scripts/setup.sh can help with this.
and scripts/setup.sh status shows
Hugepages
node hugesize free / total
node0 1048576kB 24 / 24
node0 2048kB 0 / 800
node1 1048576kB 0 / 0
node1 2048kB 0 / 224
NVMe devices
BDF Numa Node Driver name Device name
I/OAT DMA
BDF Numa Node Driver Name
virtio
BDF Numa Node Driver Name Device Name
My setup is:
Linux kernel version 4.15.7
with ioatdma compiled as a module
CPU: Intel Xeon E5-2695
chipset: C612
It would be a great help if somebody could give me some advice or point me to some websites about SPDK!
Thank you!
Run ./scripts/setup.sh (with no parameters). If no ioat devices are listed under the I/OAT DMA section afterwards, you can't run this app. Also, there are no hugetlbfs mount points.
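The missing hugetlbfs mount can be checked independently of SPDK by reading /proc/mounts. The following is a rough sketch with a hypothetical function name; it assumes the usual /proc/mounts layout (device, mount point, filesystem type, options) and that the page size, when reported, appears as a pagesize= mount option.

```python
def hugetlbfs_mounts(mounts_text):
    """Return (mount_point, pagesize) pairs for hugetlbfs entries in /proc/mounts text.

    pagesize is None when the options field carries no pagesize= entry.
    """
    mounts = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "hugetlbfs":
            continue
        pagesize = None
        for option in fields[3].split(","):
            if option.startswith("pagesize="):
                pagesize = option.split("=", 1)[1]
        mounts.append((fields[1], pagesize))
    return mounts


if __name__ == "__main__":
    with open("/proc/mounts") as mounts_file:
        print(hugetlbfs_mounts(mounts_file.read()))
```

An empty list would match the EAL message about no mounted hugetlbfs being found.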

DPDK test application cannot be found on Red Hat

I ran into an issue when deploying DPDK on Red Hat; the following error is shown:
sudo: x86_64-native-linuxapp-gcc/app/test: command not found
I am not sure what the problem is.
Now I cannot test DPDK. Could someone help me if you have seen this before?
Some detailed information about my system is below, FYI.
Kernel version
3.10.0-693.11.1.el7.x86_64
[root#cnhzdhcp16557 usertools]# ./dpdk-setup.sh
Build x86_64-native-linuxapp-gcc
...
== Build app/test-crypto-perf
== Build app/test-eventdev
Build complete [x86_64-native-linuxapp-gcc]
Installation cannot run with T defined and DESTDIR undefined
Insert IGB UIO module
Unloading any existing DPDK UIO module
Loading DPDK UIO module
Insert VFIO module
Unloading any existing VFIO module
Loading VFIO module
chmod /dev/vfio
OK
Insert KNI module
Unloading any existing DPDK KNI module
Loading DPDK KNI module
Press enter to continue ...
Network devices using kernel driver
0000:00:19.0 'Ethernet Connection I217-V 153b' if=enp0s25 drv=e1000e unused=igb_uio Active
0000:02:00.0 'Centrino Advanced-N 6235 088e' if=wlo1 drv=iwlwifi unused=igb_uio
Huge page information
AnonHugePages: 98304 kB
HugePages_Total: 128
HugePages_Free: 128
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Run test application ($RTE_TARGET/app/test)
Enter hex bitmask of cores to execute test app on
Example: to execute app on cores 0 to 7, enter 0xff
bitmask: f
Launching app
sudo: x86_64-native-linuxapp-gcc/app/test: command not found
Run testpmd application in interactive mode ($RTE_TARGET/app/testpmd)
Enter hex bitmask of cores to execute test app on
Example: to execute app on cores 0 to 7, enter 0xff
bitmask: f
Launching app
EAL: Detected 4 lcore(s)
EAL: No free hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: PCI device 0000:00:19.0 on NUMA socket -1
EAL: Invalid NUMA socket, default to 0
EAL: probe driver: 8086:153b net_e1000_em
EAL: No probed ethernet devices
Interactive-mode selected
USER1: create a new mbuf pool : n=171456, size=2176,
socket=0
EAL: Error - exiting with code: 1
Cause: Creation of mbuf pool for socket 0 failed: Cannot allocate memory
The test application should be built manually with the make test... command. What you really want is for the testpmd application to work. There are two issues:
The EAL: No probed ethernet devices log means there are no NICs available for testpmd. You need to bind your NIC to igb_uio in order to use it with a DPDK application.
The Cause: Creation of mbuf pool for socket 0 failed: Cannot allocate memory log means there are not enough huge pages to allocate the mempool. Indeed:
HugePages_Free: 128
Hugepagesize: 2048 kB
There are 128 pages of 2M each, which makes 256M of available memory, while testpmd tries to create a new mbuf pool with n=171456, size=2176, which needs 171456 * 2176 = 373M, so it fails.
The solution is either to allocate more huge pages or to run testpmd with the --total-num-mbufs command line option.
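The arithmetic above can be reproduced in a few lines. This is a simplified sketch with hypothetical names: it multiplies counts by sizes only and ignores any per-object or pool header overhead, so the mbuf count it reports is an upper bound rather than a value that is guaranteed to work. Note that 373M is the decimal figure (about 356 MiB); either way the request exceeds the 256 MiB that is free.

```python
# Values taken from the huge page information and the testpmd log above.
HUGEPAGES_FREE = 128
HUGEPAGE_SIZE_BYTES = 2048 * 1024
MBUF_COUNT = 171456
MBUF_SIZE_BYTES = 2176


def available_bytes(free_pages, page_size_bytes):
    """Memory backed by free huge pages."""
    return free_pages * page_size_bytes


def required_bytes(mbuf_count, mbuf_size_bytes):
    """Raw size of the requested mbuf pool, without any overhead."""
    return mbuf_count * mbuf_size_bytes


def max_mbufs(available, mbuf_size_bytes):
    """Upper bound on how many mbufs of this size fit into the available memory."""
    return available // mbuf_size_bytes


if __name__ == "__main__":
    available = available_bytes(HUGEPAGES_FREE, HUGEPAGE_SIZE_BYTES)
    required = required_bytes(MBUF_COUNT, MBUF_SIZE_BYTES)
    print("available bytes:", available)
    print("required bytes: ", required)
    print("fits:", required <= available)
    print("upper bound for --total-num-mbufs:", max_mbufs(available, MBUF_SIZE_BYTES))
```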

Goal is to pull the video driver version from the display information, then compare it to a list of versions supported

The text file has the following information:
System Information
------------------
Time of this report: 5/22/2014, 14:20:52
Machine name: CONFERENCE13
Operating System: Windows 7 Professional 64-bit (6.1, Build 7601) Service Pack 1 (7601.win7sp1_gdr.140303-2144)
Language: English (Regional Setting: English)
System Manufacturer: Mario, Inc.
System Model: Mario Virtual Platform
BIOS: PhoenixBIOS 4.0 Release 6.0
Processor: Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz (4 CPUs), ~2.7GHz
Memory: 2048MB RAM
Available OS Memory: 2048MB RAM
Page File: 1302MB used, 2792MB available
Windows Dir: C:\Windows
DirectX Version: DirectX 11
DX Setup Parameters: Not found
User DPI Setting: Using System DPI
System DPI Setting: 96 DPI (100 percent)
DWM DPI Scaling: Disabled
DxDiag Version: 6.01.7601.17514 32bit Unicode
------------
DxDiag Notes
------------
Display Tab 1: No problems found.
Sound Tab 1: No problems found.
Input Tab: No problems found.
--------------------
DirectX Debug Levels
--------------------
Direct3D: 0/4 (retail)
DirectDraw: 0/4 (retail)
DirectInput: 0/5 (retail)
DirectMusic: 0/5 (retail)
DirectPlay: 0/9 (retail)
DirectSound: 0/5 (retail)
DirectShow: 0/6 (retail)
---------------
Display Devices
---------------
Card name: Mario SVGA 3D
Manufacturer: Mario, Inc.
Chip type: Mario Virtual SVGA 3D Graphics Adapter
DAC type: n/a
Device Key: Enum\PCI\VEN_15AD&DEV_0405&SUBSYS_040515AD&REV_00
Display Memory: 223 MB
Dedicated Memory: 35 MB
Shared Memory: 188 MB
Current Mode: 1555 x 794 (32 bit) (60Hz)
Monitor Name: Generic Non-PnP Monitor
Monitor Model: unknown
Monitor Id:
Native Mode: unknown
Output Type: HD15
Driver Name: vm3dum64.dll,vm3dum,vm3dgl64.dll,vm3dgl
Driver File Version: 7.14.0001.2032 (English)
Driver Version: 7.14.1.2032
DDI Version: unknown
Driver Model: WDDM 1.0
Driver Attributes: Final Retail
Driver Date/Size: 2/11/2014 03:15:04, 258264 bytes
WHQL Logo'd: n/a
WHQL Date Stamp: n/a
Device Identifier: {D7B71B4D-4745-11CF-ED71-0424A1C2CA35}
Vendor ID: 0x15AD
Device ID: 0x0405
SubSys ID: 0x040515AD
Revision ID: 0x0000
Driver Strong Name: oem13.inf:VMware.NTamd64.6.0:VM3D_AMD64:7.14.1.2032:pci\ven_15ad&dev_0405&subsys_040515ad&rev_00
Rank Of Driver: 00F60000
Video Accel:
Deinterlace Caps: n/a
D3D9 Overlay: n/a
DXVA-HD: n/a
DDraw Status: Not Available
D3D Status: Not Available
AGP Status: Not Available
-------------
Sound Devices
-------------
Description: Speakers (Mario Virtual Audio (DevTap))
Default Sound Playback: Yes
Default Voice Playback: Yes
Hardware ID: PNPB009
Manufacturer ID: 1
Product ID: 100
Type: WDM
Driver Name: vmwvaudio.sys
Driver Version: 6.00.0000.3800 (English)
Driver Attributes: Final Retail
WHQL Logo'd: n/a
Date and Size: 11/13/2013 21:22:16, 46672 bytes
Other Files:
Driver Provider: VMware
HW Accel Level: Basic
Cap Flags: 0x0
Min/Max Sample Rate: 0, 0
Static/Strm HW Mix Bufs: 0, 0
Static/Strm HW 3D Bufs: 0, 0
HW Memory: 0
Voice Management: No
EAX(tm) 2.0 Listen/Src: No, No
I3DL2(tm) Listen/Src: No, No
Sensaura(tm) ZoomFX(tm): No
I am trying to pull the card name and driver file version from the Display Devices section and then compare them with a list such as the following:
Windows XP Windows Vista Windows 7 Windows 8 Windows 8.1 Windows Server 2008 R2
View 3.1.3 build 252693
VMware SVGA II
Version: 11.6.0.35
Dated: 4/21/2010
VMware SVGA 3D
Version: 17.14.1.42
Dated: 4/21/2010
Not Supported Not Supported Not Supported Not Supported
View 4.0.2 build 294291
VMware SVGA II
Version: 11.6.0.35
Dated: 4/21/2010
I tried awk but it gives me an error. I am new to awk and bash and need some help. Thank you.
awk 'BEGIN{
FS="="; OFS=" - "; DispalyDevices=""
}
function display(){
print displaydevices,cardname,driverfileversion
}
/DisplayDevices/{
if(cardname!="") display();
cardname=""; driverfileversion=""; display=$0;
gsub("Display.*PLAY"; "Display",display)
}
/cardname/{cardname=$2}
/driverfileversion/{driverfileversion=$2}
END{display}' dx_diag.txt | cat > dx_outputfile.txt
The error is:
awk: syntax error at source line 1
Within this context:
BEGIN{FS="="; OFS=" - "; DispalyDevices=""}function display(){print displaydevices,cardname,driverfileversion}/DisplayDevices/{if(cardname!="") display(); cardname=""; driverfileversion=""; display=$0; >>> gsub("Display.*PLAY"; <<<
awk: illegal statement at source line 1
awk: illegal statement at source line 1
Change
gsub("Display.*PLAY"; "Display",display)
to
gsub("Display.*PLAY", "Display",display)
#-------------------^--- comma, not semi-colon
IHTH
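The comma fixes the syntax error, but the extraction itself still has to match the report's "Key: value" layout. As an alternative illustration, here is a hedged Python sketch of the same goal. All function names are hypothetical, the supported-versions mapping is made up for the example, and it assumes each section title sits between two lines of dashes, as in the Display Devices and Sound Devices blocks above.

```python
import re

# A section header is a title line enclosed by two lines of dashes.
SECTION_RE = re.compile(
    r"^-{3,}[ \t]*\n(?P<title>[^\n]+)\n-{3,}[ \t]*$", re.MULTILINE
)
# A field is "Key: value" on one line.
FIELD_RE = re.compile(r"^\s*(?P<key>[^:\n]+):[ \t]*(?P<value>.*)$", re.MULTILINE)


def section_body(report_text, title):
    """Return the text between the named section header and the next header."""
    matches = list(SECTION_RE.finditer(report_text))
    for index, match in enumerate(matches):
        if match.group("title").strip() == title:
            is_last = index + 1 == len(matches)
            end = len(report_text) if is_last else matches[index + 1].start()
            return report_text[match.end():end]
    return ""


def display_devices(report_text):
    """Collect card name and driver file version for each display device."""
    devices = []
    current = None
    for match in FIELD_RE.finditer(section_body(report_text, "Display Devices")):
        key = match.group("key").strip()
        value = match.group("value").strip()
        if key == "Card name":
            current = {"card_name": value, "driver_file_version": None}
            devices.append(current)
        elif key == "Driver File Version" and current is not None:
            current["driver_file_version"] = value
    return devices


def is_supported(device, supported_versions):
    """Check the version (text before any '(English)' suffix) against a mapping."""
    version = device["driver_file_version"]
    if not version:
        return False
    allowed = supported_versions.get(device["card_name"], set())
    return version.split()[0] in allowed
```

Reading dx_diag.txt, passing its text to display_devices, and then calling is_supported with a mapping built from the support list would cover the comparison step; how that list is best represented depends on its real layout.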
