Tensorflow 2.4 doesn't work despite my cpu having AVX support

I am running Ubuntu 20.04.1 LTS with TensorFlow 2.4.0, and I cannot use the library because importing it always crashes.
This is the only line I run:
import tensorflow as tf
This is the error it produces:
Illegal instruction (core dumped)
These are the processor specs:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel(R) Core(TM) i5-3437U CPU @ 1.90GHz
stepping : 9
microcode : 0x21
cpu MHz : 842.451
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts md_clear flush_l1d
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit srbds
bogomips : 4789.04
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
If more information is needed, I will provide it.

I have the same problem running TensorFlow 2.4.0 on Ubuntu 18.04.4 LTS. I have not found a solution yet, so for now I am using the previous version, which works for me:
pip uninstall tensorflow
pip install tensorflow==2.3.1

It should be fixed now in 2.4.1
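
An "Illegal instruction" crash on import generally means the binary executed a CPU instruction that the processor does not implement. One way to see which SIMD extensions a machine reports is to parse the flags line of /proc/cpuinfo. The sketch below is only illustrative: the helper names and the checklist of flags are my own assumptions, not a documented list of what any TensorFlow build requires.

```python
import os


def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def report(flags, wanted=("sse4_1", "sse4_2", "avx", "avx2", "fma")):
    """Map each flag in `wanted` (a hypothetical checklist) to True or False."""
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    path = "/proc/cpuinfo"  # Linux only
    if os.path.exists(path):
        with open(path) as f:
            for name, present in report(cpu_flags(f.read())).items():
                print(f"{name:8s} {'yes' if present else 'no'}")
```

Applied to the flags posted in the question, this would show avx as present but avx2 and fma as absent, since neither appears in that flags line.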

Related

How to enable Intel TSX in i7-7700 cpu?

I have an Intel i7-7700 CPU. lscpu shows that it has the Intel TSX feature (the hle and rtm flags):
flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi
mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs
bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx
est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer
aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb invpcid_single intel_pt ssbd ibrs ibpb stibp
kaiser tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms
invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 dtherm ida arat pln pts hwp
hwp_notify hwp_act_window hwp_epp md_clear flush_l1d arch_capabilities
However, I'm unable to use it. I was trying to run the test program from this repository: https://github.com/blue9057/intel-tsx-sample. I checked IA32_ARCH_CAPABILITIES with sudo rdmsr 0x10AH, and it returned 0xc04, which indicates that bit 7 (TSX_CTRL) is 0. So there is no TSX support. I'm not an expert, but I tried to change the MSR with sudo wrmsr 0x10ah 0xC84 and got the following error:
wrmsr: CPU 0 cannot set MSR 0x0000010a to 0x0000000000000c84
I also tried to read IA32_TSX_CTRL with sudo rdmsr 0x122H and got the following error:
rdmsr: CPU 0 cannot read MSR 0x00000122
I disabled Secure Boot in the BIOS.
lscpu shows microcode 0xf0. I believe Intel disabled the TSX feature with a microcode update. I'm not sure whether this update is applied automatically, but as far as I can remember I did not install any.
I'm not sure if this is relevant, but /usr/src/linux-headers-4.4.0-210/arch/x86/include/asm/msr-index.h contains:
#define MSR_IA32_TSX_CTRL 0x00000122
#define TSX_CTRL_RTM_DISABLE BIT(0) /* Disable RTM feature */
#define TSX_CTRL_CPUID_CLEAR BIT(1) /* Disable TSX enumeration */
I also commented out #define TSX_CTRL_RTM_DISABLE and #define TSX_CTRL_CPUID_CLEAR, but no luck.
Is there any way I can enable this feature? I'm not sure I have provided enough information, so please let me know what else would be helpful and I will add it. Pointers in the right direction would also help.
OS version: Linux xxxx 4.4.0-210-generic x86_64
UPDATE
Complete lscpu output:
xxx@xxx:~$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 158
Model name: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz
Stepping: 9
CPU MHz: 900.000
CPU max MHz: 4200.0000
CPU min MHz: 800.0000
BogoMIPS: 7199.80
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 8192K
NUMA node0 CPU(s): 0-7
Flags: fpu vme de pse tsc msr pae mce cx8 apic
sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse
sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art
arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc
aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2
ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe
popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm
3dnowprefetch epb invpcid_single intel_pt ssbd ibrs ibpb stibp
kaiser tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust
bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap
clflushopt xsaveopt xsavec xgetbv1 dtherm ida arat pln pts hwp
hwp_notify hwp_act_window hwp_epp md_clear flush_l1d
arch_capabilities
The kernel log shows:
$ dmesg | grep "tsx=on"
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-4.4.0-210-generic root=UUID=b528525a-2daa-4537-964e-106314e6a7bd ro quiet splash nokaslr tsx_async_abort=off tsx=on vt.handoff=7
[ 0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-4.4.0-210-generic root=UUID=b528525a-2daa-4537-964e-106314e6a7bd ro quiet splash nokaslr tsx_async_abort=off tsx=on vt.handoff=7
$ dmesg | grep "TAA"
[ 0.063012] TAA: Mitigation: Clear CPU buffers
[ 0.209727] TAA CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html for more details.
Instead of the example from the repository mentioned above, I tried to run the following program:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>     /* sleep() */
#include <immintrin.h>  /* _xbegin(), _xend(); compile with -mrtm */

int main(int argc, char *argv[]) {
    volatile int result = -1;
    unsigned status;

    while (result != 1) {
        /* Try to start an RTM transaction. */
        if ((status = _xbegin()) == _XBEGIN_STARTED) {
            result = 1;
            _xend();  /* commit the transaction */
        } else {
            /* The transaction aborted; status holds the abort code. */
            printf("rtmCheck: Transaction failed\n");
            printf("Trying again ...\n");
            sleep(5);
        }
    }
    printf("rtmCheck : Result is %d\n", result);
    return 0;
}
Makefile:
CC=gcc
CFLAGS= -mrtm
rtmCheck: rtmCheck.o
	$(CC) $(CFLAGS) -o rtmCheck rtmCheck.o
clean:
	rm -f *.o rtmCheck
Output:
rtmCheck: Transaction failed
Trying again ...
rtmCheck: Transaction failed
Trying again ...
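
The bit arithmetic on the IA32_ARCH_CAPABILITIES value quoted in the question can be double-checked with a few lines of Python. This is a sketch, not a diagnostic tool: the bit names for bits 0 to 8 follow the ARCH_CAP_* definitions in the Linux kernel's msr-index.h as I remember them, and higher bits are deliberately left unnamed. As far as I know, this MSR is read-only, which would explain the wrmsr failure, and bit 7 only announces that the IA32_TSX_CTRL MSR (0x122) exists rather than whether TSX itself is present, so a failing rdmsr 0x122 is consistent with bit 7 being clear.

```python
# Names for bits 0-8, after the ARCH_CAP_* macros in Linux's msr-index.h.
# Higher bits are reported by number only.
ARCH_CAP_BITS = {
    0: "RDCL_NO",
    1: "IBRS_ALL",
    2: "RSBA",
    3: "SKIP_VMENTRY_L1DFLUSH",
    4: "SSB_NO",
    5: "MDS_NO",
    6: "PSCHANGE_MC_NO",
    7: "TSX_CTRL_MSR",
    8: "TAA_NO",
}


def decode_arch_capabilities(value):
    """Return the names of all bits set in an IA32_ARCH_CAPABILITIES value."""
    names = []
    for bit in range(value.bit_length()):
        if (value >> bit) & 1:
            names.append(ARCH_CAP_BITS.get(bit, f"bit{bit}"))
    return names


def has_tsx_ctrl_msr(value):
    """True if bit 7 is set, meaning the IA32_TSX_CTRL MSR is enumerated."""
    return bool((value >> 7) & 1)


if __name__ == "__main__":
    for raw in (0xC04, 0xC84):
        print(hex(raw), decode_arch_capabilities(raw), has_tsx_ctrl_msr(raw))
```

For 0xc04 only bits 2, 10 and 11 are set, so bit 7 is indeed clear; 0xc84 is the same value with bit 7 added.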

Can I update CPU feature flags in Linux so that OpenStack can live migrate instances, and how?

I have three hypervisors with two types of CPU:
# hypervisor node-3:
cpu family : 6
model : 85
model name : Intel(R) Xeon(R) Gold 6248R CPU @ 3.00GHz
stepping : 7
microcode : 0x5003003
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke avx512_vnni md_clear flush_l1d arch_capabilities
bugs : spectre_v1 spectre_v2 spec_store_bypass swapgs taa itlb_multihit
# hypervisor node-4:
cpu family : 6
model : 85
model name : Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz
stepping : 7
microcode : 0x5003102
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke avx512_vnni md_clear flush_l1d arch_capabilities
bugs : spectre_v1 spectre_v2 spec_store_bypass swapgs taa itlb_multihit
# hypervisor node-5:
cpu family : 6
model : 85
model name : Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz
stepping : 7
microcode : 0x5003003
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single intel_ppin mba tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local ibpb ibrs stibp dtherm ida arat pln pts pku ospke avx512_vnni arch_capabilities
bugs : spectre_v1 spectre_v2
I can live migrate instances between node-3 and node-4, but node-5 cannot live migrate to or from any of the other hypervisors.
I know this is because the flags are not consistent, as described under Host model (default for KVM & QEMU).
Questions:
1. Can I update the CPU's flags, and how?
2. Why does node-3, which has a different CPU model, have the same flags as node-4, while node-4 and node-5 have the same CPU model but different flags?
3. What does the microcode do? Why is only node-4's microcode different from the others?
The root cause was that the kernel had not been updated.
I solved it with:
apt-get install -y intel-microcode linux-image-generic
and then rebooted the hypervisor.

The CPU model and flags are part of the guest-visible ABI. As such, they can only be set at cold boot. In other words, to change them you'll need to power off the guest, update the <cpu> config as needed, and then boot the guest once more. Of course, if you have to cold boot anyway, there's little point in live migration: just boot it on the required host.
The microcode is loaded into the CPUs on host boot, essentially as a way to provide bug fixes to your physical CPUs. Updated microcode typically gets installed either when you update your BIOS / firmware or when the operating system vendor ships updated microcode packages. Microcode can both add and remove CPU features, depending on what it's trying to fix. Removing features is a problem for live migration, as it can prevent live migration from hosts without the microcode update to hosts with the microcode update.
Clearly some of your nodes have received updated microcode, but others have not.
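
To see exactly which flags differ between two hosts, the flags lines from /proc/cpuinfo can be compared as sets. The snippet below is a minimal sketch: the function name is made up, and the two strings are shortened excerpts of the node-3 and node-5 flags from the question, not the full lines.

```python
def flag_diff(flags_a, flags_b):
    """Return (only_in_a, only_in_b) for two space-separated flag strings."""
    a, b = set(flags_a.split()), set(flags_b.split())
    return sorted(a - b), sorted(b - a)


# Abbreviated excerpts of the flags posted for node-3 and node-5.
node3 = "avx2 smep bmi2 erms invpcid cqm md_clear flush_l1d"
node5 = "hle avx2 smep bmi2 erms invpcid rtm cqm"

only_node3, only_node5 = flag_diff(node3, node5)
print("only on node-3:", only_node3)
print("only on node-5:", only_node5)
```

On these excerpts, hle and rtm show up only on node-5, while md_clear and flush_l1d show up only on node-3, which fits the picture of node-5 running older microcode and kernel.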

Why do I have no scaling_governor?

I am running V8's benchmark program, and I ran the following command:
./tools/cpu.sh fast
It prints:
Setting CPU frequency governor to "ondemand"
./tools/cpu.sh: line 13: /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor: no such file or directory
Then I ran:
# ls /sys/devices/system/cpu/cpu0
cache crash_notes crash_notes_size microcode node0 power subsystem topology uevent
and found that there is no "cpufreq" directory.
After some searching, I found that I should install cpufrequtils, so I ran:
yum install cpufrequtils
After that, nothing changed. So I wonder what is wrong here.
My system is:
# lsb_release -a
LSB Version: :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 7.2 (Final)
Release: 7.2
Codename: Final
And my cpuinfo is:
cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 94
model name : Intel(R) Xeon(R) Gold 61xx CPU
stepping : 3
microcode : 0x1
cpu MHz : 2499.998
cache size : 4096 KB
physical id : 0
siblings : 1
core id : 0
cpu cores : 1
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap xsaveopt xsavec xgetbv1 arat
bogomips : 4999.99
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

It depends on your kernel configuration whether the governor interface is exposed at all, and which governors are available. I don't know any specifics about CentOS. On Debian/Ubuntu the governors should be available by default (last I checked, they were).
Maybe https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/power_management_guide/cpufreq_governors helps?
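
One more thing that may be worth checking: the flags posted in the question include hypervisor, which Linux sets when it runs as a virtual machine guest. Guests commonly have no cpufreq driver, because the host manages the CPU frequency, and in that case installing cpufrequtils would not create the sysfs files. That is my reading of the cpuinfo, not something the asker confirmed. A small sketch of such a check (the function names and messages are hypothetical):

```python
import os


def is_virtual_machine(cpuinfo_text):
    """True if any 'flags' line of the cpuinfo text contains 'hypervisor'."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "hypervisor" in value.split():
            return True
    return False


def diagnose(cpuinfo_text, sysfs_root="/sys/devices/system/cpu"):
    """Explain, in one line, whether cpu0 exposes a scaling governor."""
    path = os.path.join(sysfs_root, "cpu0", "cpufreq", "scaling_governor")
    if os.path.exists(path):
        with open(path) as f:
            return f"governor: {f.read().strip()}"
    if is_virtual_machine(cpuinfo_text):
        return "no cpufreq interface; 'hypervisor' flag set, probably a VM guest"
    return "no cpufreq interface; check kernel config and cpufreq driver"


if __name__ == "__main__":
    if os.path.exists("/proc/cpuinfo"):  # Linux only
        with open("/proc/cpuinfo") as f:
            print(diagnose(f.read()))
```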

How to find the L3 Cache parameters on Intel CPUs?

I would like to know the following L3 cache parameters, but I am not sure how to get them. I have also pasted the /proc/cpuinfo output below (4 processors; only the first is shown, since the others are repetitive).
CACHE_SIZE
LINE_SIZE
Associativity
$ cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 58
model name : Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
stepping : 9
microcode : 0x15
cpu MHz : 1200.000
cache size : 4096 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms
bogomips : 5786.68
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
UPDATE 1: It seems that someone posted the cache size and associativity here:
http://www.cpu-world.com/CPUs/Core_i7/Intel-Core%20i7-3520M%20%28PGA%29%20Mobile%20processor.html
But I still don't know the line size.

Hwloc / lstopo
Hwloc (Portable Hardware Locality) is a small package whose lstopo tool draws the structure of the processor in a neat diagram. The diagram shows the number of cores, the hyperthreads, and the cache sizes.
$ sudo apt-get install hwloc
$ lstopo
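
For the line size and associativity specifically, Linux also exposes per-cache attributes under /sys/devices/system/cpu/cpuN/cache/indexM/. The sketch below reads them for cpu0; the function name is my own, and which index directory holds the L3 cache varies by machine, so the level file is printed rather than assumed.

```python
import glob
import os

# Attribute files found in each cache/indexN directory in sysfs.
FIELDS = ("level", "type", "size", "coherency_line_size",
          "ways_of_associativity", "number_of_sets")


def read_caches(cpu_dir="/sys/devices/system/cpu/cpu0"):
    """Return one dict per cache of the given CPU; missing files map to None."""
    caches = []
    pattern = os.path.join(cpu_dir, "cache", "index*")
    for index_dir in sorted(glob.glob(pattern)):
        entry = {}
        for field in FIELDS:
            try:
                with open(os.path.join(index_dir, field)) as f:
                    entry[field] = f.read().strip()
            except OSError:
                entry[field] = None
        caches.append(entry)
    return caches


if __name__ == "__main__":
    for cache in read_caches():
        print(", ".join(f"{k}={v}" for k, v in cache.items()))
```

The entry whose level is 3 gives the L3 size, line size (coherency_line_size, in bytes) and associativity. The clflush size and cache_alignment of 64 in the posted cpuinfo are consistent with a 64-byte line, although that is an inference rather than a direct readout.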

What does 'the use of this feature is restricted' mean?

I am using XenServer 6.1 on my two servers.
I want to use live migration,
but the servers cannot join the same resource pool.
So I tried the CPU masking feature.
However, that isn't working either.
My first server's info is:
[server-1]
cpu_count : 32
vendor: GenuineIntel
speed: 2000.066
modelname: Intel(R) Xeon(R) CPU E5-2640 v2 @ 2.00GHz
family: 6
model: 62
stepping: 4
flags: fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat
clflush acpi mmx fxsr sse sse2 ss ht nx constant_tsc
nonstop_tsc aperfmperf pni pclmulqdq vmx est ssse3 sse4_1 sse4_2
x2apic popcnt aes hypervisor ida arat tpr_shadow vnmi
flexpriority ept vpid
features: 77bee3ff-bfebfbff-00000001-2c100800
features_after_reboot: 77bee3ff-bfebfbff-00000001-2c100800
physical_features: 77bee3ff-bfebfbff-00000001-2c100800
maskable: no
My second server's info is:
[server-2]
cpu_count : 24
vendor: GenuineIntel
speed: 2000.040
modelname: Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz
family: 6
model: 45
stepping: 7
flags: fpu de tsc msr pae mce cx8 apic sep mtrr mca cmov pat clflush
acpi mmx fxsr sse sse2 ss ht nx constant_tsc nonstop_tsc
aperfmperf pni pclmulqdq vmx est ssse3 sse4_1 sse4_2 x2apic
popcnt aes hypervisor ida arat tpr_shadow vnmi flexpriority
ept vpid
features: 17bee3ff-bfebfbff-00000001-2c100800
features_after_reboot: 17bee3ff-bfebfbff-00000001-2c100800
physical_features: 17bee3ff-bfebfbff-00000001-2c100800
maskable: full
I ran this command on server 1:
xe host-set-cpu-features features=17bee3ff-bfebfbff-00000001-2c100800 uuid=6c91e5c8-06b9-4b5c-a41d-ec4d6b2c44aa
The result is 'The use of this feature is restricted'.
Then I ran this command on server 2:
xe host-set-cpu-features features=77bee3ff-bfebfbff-00000001-2c100800 uuid=53566e64-a24f-42a4-8a6d-a26e9f740fa8
The result is the same.
What does this message mean?
'The use of this feature is restricted'.
How can I use CPU masking in my environment?

I encountered the same error message when trying to join a host with a cpu model E5503 to a pool with two hosts with E5645's. I was not able to get past this error with XenServer 6.1, but after upgrading the pool and the lone host to 6.2 I was able to apply a mask and join the third host to the pool without further issue.
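
When choosing a mask, it can help to see which feature bits the two hosts actually share. The sketch below assumes that each of the four hyphen-separated groups in the features string is a 32-bit hexadecimal word of CPUID feature bits; that is my reading of the format, and the function names are made up. The common mask is then the bitwise AND of the two strings, word by word.

```python
def parse_mask(mask):
    """Split 'xxxxxxxx-xxxxxxxx-...' into a list of integers."""
    return [int(word, 16) for word in mask.split("-")]


def format_mask(words):
    """Join integers back into the hyphen-separated 8-digit hex form."""
    return "-".join(f"{word:08x}" for word in words)


def common_mask(mask_a, mask_b):
    """Bitwise AND of two feature strings: the features both hosts have."""
    pairs = zip(parse_mask(mask_a), parse_mask(mask_b))
    return format_mask(a & b for a, b in pairs)


def differing_bits(mask_a, mask_b):
    """Map word index -> bit positions where the two masks differ."""
    result = {}
    pairs = zip(parse_mask(mask_a), parse_mask(mask_b))
    for index, (a, b) in enumerate(pairs):
        bits = [n for n in range(32) if ((a ^ b) >> n) & 1]
        if bits:
            result[index] = bits
    return result


server1 = "77bee3ff-bfebfbff-00000001-2c100800"  # E5-2640 v2, from the question
server2 = "17bee3ff-bfebfbff-00000001-2c100800"  # E5-2620, from the question

print(common_mask(server1, server2))
print(differing_bits(server1, server2))
```

For the two hosts in the question the common mask equals server 2's features, and the only difference is bits 29 and 30 of the first word. If that word is CPUID leaf 1 ECX (again an assumption), those bits would be F16C and RDRAND.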
