compiler flags on ARM based processors - linux

I am compiling C++ source code on Nvidia Jetson nano. Some details of the processor are as follows:
cat /proc/cpuinfo
processor : 0
model name : ARMv8 Processor rev 1 (v8l)
BogoMIPS : 38.40
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd07
CPU revision : 1
processor : 1
model name : ARMv8 Processor rev 1 (v8l)
BogoMIPS : 38.40
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd07
CPU revision : 1
processor : 2
model name : ARMv8 Processor rev 1 (v8l)
BogoMIPS : 38.40
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd07
CPU revision : 1
processor : 3
model name : ARMv8 Processor rev 1 (v8l)
BogoMIPS : 38.40
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd07
CPU revision : 1
I get the following error when compiling the code:
c++: error: unrecognized command line option '-mfpu=neon'
c++: error: unrecognized command line option '-mfpu=neon'
Based on the specifications above, what flags should I set for -mfpu?

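For ARMv8-A profile processors the FPU and NEON are mandatory parts of the architecture (unlike ARMv7-A), so you don't need an -mfpu flag to enable them. Remove the flag from your build options; as the error shows, the compiler on this platform does not recognize it.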
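To confirm what the compiler assumes for the target, one option is to check the predefined macros at compile time. The sketch below is only an illustration: the function name simd_support is made up for this example, and it relies on the ACLE macro __ARM_NEON (plus the older __ARM_NEON__ spelling) and the usual __aarch64__ and __arm__ target macros.

```c
#include <stdio.h>

/* Hypothetical helper: describes the SIMD support the compiler assumes
 * for the current target, based on predefined macros only. */
const char *simd_support(void)
{
#if defined(__aarch64__)
    /* 64-bit ARM: Advanced SIMD (NEON) is part of the baseline,
     * so no -mfpu option is needed. */
    return "AArch64: NEON available by default";
#elif defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* 32-bit ARM with NEON turned on, for example via -mfpu=neon. */
    return "32-bit ARM: NEON enabled";
#elif defined(__arm__)
    return "32-bit ARM: NEON not enabled";
#else
    return "not an ARM target";
#endif
}
```

Printing simd_support() from main on the Jetson Nano should report the AArch64 case when built with the native 64-bit toolchain. CPU part 0xd07 corresponds to a Cortex-A57, so if tuning is wanted, something like -mcpu=cortex-a57 may be worth trying in place of -mfpu.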

Related

Tasklet counts in /proc/softirqs increase very rapidly on USB operation in Linux

I have a legacy device with the following configuration:
Chipset Architecture : Intel NM10 express
CPU : Atom D2250 Dual Core
Volatile Memory : 1GB
CPU core : 4
USB Host controller driver : ehci-pci
When I perform any USB operation, the tasklet count increases linearly. If the USB operation continues for a long time (approximately half an hour), the count exceeds a million, which seems very strange to me.
I read about the interrupt handling mechanism used by ehci-pci, which is not the latest (i.e. it is PCI pin-based), but the tasklet counts still seem far too high.
I use /proc/softirqs to read the tasklet count.
Any leads on this?
root@panther1:~# cat /proc/interrupts
                CPU0       CPU1       CPU2       CPU3
      23:          0          0          0    1411443  IO-APIC 23-fasteoi ehci_hcd:usb1, uhci_hcd:usb2
root@panther1:~# cat /proc/softirqs
                CPU0       CPU1       CPU2       CPU3
      HI:          7          3         13    1125529
   TIMER:    2352846    2325384    2533628    2675821
  NET_TX:          0          0          2       1703
  NET_RX:       1161       1193       2730      51184
   BLOCK:          0          0          0          0
IRQ_POLL:          0          0          0          0
 TASKLET:        256        164         90    1162220
   SCHED:    1078965    1015261    1155661    1207484
 HRTIMER:          0          0          0          0
     RCU:    1370647    1367098    1485356    1503762
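To turn these counters into a rate, it can help to sample /proc/softirqs twice and subtract the totals. A minimal sketch of the parsing step, assuming the row layout shown above (a label, a colon, then one counter per CPU); the function name sum_softirq_row is hypothetical:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: sums the per-CPU counters on one /proc/softirqs
 * row such as "TASKLET: 256 164 90 1162220".
 * Returns -1 if the row does not start with "<label>:". */
long long sum_softirq_row(const char *line, const char *label)
{
    size_t n = strlen(label);

    /* Skip the leading padding the kernel uses to align labels. */
    while (*line == ' ' || *line == '\t')
        line++;
    if (strncmp(line, label, n) != 0 || line[n] != ':')
        return -1;

    const char *p = line + n + 1;
    long long total = 0;
    for (;;) {
        char *end;
        unsigned long long v = strtoull(p, &end, 10);
        if (end == p)          /* no more numbers on this row */
            break;
        total += (long long)v;
        p = end;
    }
    return total;
}
```

Reading the file line by line with fgets, calling this with the label TASKLET, and repeating after a fixed interval gives tasklets per second. Comparing that figure with the change in the ehci_hcd interrupt count over the same interval shows whether the two grow together.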

nanosleep sleep 60 microseconds too long

I have the following test, compiled with g++, in which nanosleep sleeps too long: it takes about
60 microseconds to finish, whereas I expected it to take less than 1 microsecond:
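#include <stdio.h>
#include <sys/time.h>
#include <time.h>

int main()
{
    struct timeval startx, endx;
    gettimeofday(&startx, NULL);
    struct timespec req={0};
    req.tv_sec=0;
    req.tv_nsec=100;    /* request a 100 ns sleep */
    nanosleep(&req,NULL);
    gettimeofday(&endx, NULL);
    /* tv_sec and tv_usec are not int, so cast and print them as long */
    printf("(%ld)(%ld)\n",(long)startx.tv_sec,(long)startx.tv_usec);
    printf("(%ld)(%ld)\n",(long)endx.tv_sec,(long)endx.tv_usec);
    return 0;
}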
My environment: uname -r shows:
3.10.0-123.el7.x86_64
cat /boot/config-$(uname -r) | grep HZ
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
# CONFIG_NO_HZ_IDLE is not set
CONFIG_NO_HZ_FULL=y
# CONFIG_NO_HZ_FULL_ALL is not set
CONFIG_NO_HZ=y
# CONFIG_RCU_FAST_NO_HZ is not set
# CONFIG_HZ_100 is not set
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
CONFIG_HZ_1000=y
CONFIG_HZ=1000
CONFIG_MACHZ_WDT=m
Should I change something in the HZ configuration so that nanosleep does exactly what I expect?
My CPU information:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 63
Model name: Intel(R) Xeon(R) CPU E5-2643 v3 @ 3.40GHz
Stepping: 2
CPU MHz: 3600.015
BogoMIPS: 6804.22
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 20480K
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23
Edit :
#ifdef SYS_gettid
pid_t tid = syscall(SYS_gettid);
printf("thread1...tid=(%d)\n",tid);
#else
#error "SYS_gettid unavailable on this system"
#endif
This gets the tid of the thread I want to run at high priority. Running
chrt -v -r -p 99 tid then achieves the goal. Thanks to Johan Boule for the
kind help, greatly appreciated!
Edit2 :
#ifdef SYS_gettid
const char *sched_policy[] = {
"SCHED_OTHER",
"SCHED_FIFO",
"SCHED_RR",
"SCHED_BATCH"
};
struct sched_param sp = {
.sched_priority = 99
};
pid_t tid = syscall(SYS_gettid);
printf("thread1...tid=(%d)\n",tid);
sched_setscheduler(tid, SCHED_RR, &sp);
printf("Scheduler Policy is %s.\n", sched_policy[sched_getscheduler(0)]);
#else
#error "SYS_gettid unavailable on this system"
#endif
This does exactly what I want without the help of chrt.
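One possible explanation for the overshoot, offered as an assumption rather than something confirmed in this thread, is timer slack. On Linux, threads under the normal scheduling policy have a default timer slack of 50 microseconds, and slack is not applied to threads under a real-time policy, which would fit the observation that SCHED_RR fixes the delay. The sketch below uses prctl with PR_SET_TIMERSLACK and PR_GET_TIMERSLACK; the two wrapper function names are made up for this example.

```c
#include <sys/prctl.h>

/* Hypothetical wrapper: asks the kernel to use the given timer slack
 * (in nanoseconds) for the calling thread.
 * Returns 0 on success, -1 on failure (errno is set). */
int set_timer_slack_ns(unsigned long slack_ns)
{
    return prctl(PR_SET_TIMERSLACK, slack_ns, 0UL, 0UL, 0UL);
}

/* Hypothetical wrapper: returns the calling thread's current timer
 * slack in nanoseconds, or -1 on failure. */
long get_timer_slack_ns(void)
{
    return (long)prctl(PR_GET_TIMERSLACK, 0UL, 0UL, 0UL, 0UL);
}
```

Calling set_timer_slack_ns(1) before nanosleep, without changing the scheduling policy, is a cheap experiment. If the measured sleep drops well below 60 microseconds, slack was the main contributor; if not, scheduling latency or CPU idle states are more likely, and the real-time policy remains the fix.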
(This is not an answer)
For what it's worth, using the more modern functions leads to the same result:
#include <stddef.h>
#include <time.h>
#include <stdio.h>
int main()
{
struct timespec startx, endx;
clock_gettime(CLOCK_MONOTONIC, &startx);
struct timespec req={0};
req.tv_sec=0;
req.tv_nsec=100 ;
clock_nanosleep(CLOCK_MONOTONIC, 0, &req, NULL);
clock_gettime(CLOCK_MONOTONIC, &endx);
printf("(%ld)(%ld)\n",(long)startx.tv_sec,(long)startx.tv_nsec);
printf("(%ld)(%ld)\n",(long)endx.tv_sec,(long)endx.tv_nsec);
return 0 ;
}
Output:
(296441)(153832940)
(296441)(153888488)
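The two samples above differ by 153888488 - 153832940 = 55548 ns, i.e. about 55.5 microseconds for a requested 100 ns sleep. A small helper makes that subtraction less error-prone when the seconds field also changes; the name elapsed_ns is hypothetical:

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Hypothetical helper: elapsed nanoseconds between two timespec
 * samples (end - start). */
long long elapsed_ns(struct timespec start, struct timespec end)
{
    return (long long)(end.tv_sec - start.tv_sec) * 1000000000LL
         + (long long)(end.tv_nsec - start.tv_nsec);
}
```

In the program above, printing elapsed_ns(startx, endx) with a %lld format would show the overshoot directly instead of two raw timestamps.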

Which of the below is my CPU temperature

Goal
Measure the CPU temperature of my Linux Box.
Work done so far
I have installed lm-sensors to detect the temperature, and below is the output of the sensors command:
root@XXXX-XX :# sensors
acpitz-virtual-0
Adapter: Virtual device
temp1: +66.0°C (crit = +255.0°C)
k10temp-pci-00c3
Adapter: PCI adapter
temp1: +65.4°C (high = +70.0°C)
(crit = +100.0°C, hyst = +99.0°C)
radeon-pci-0008
Adapter: PCI adapter
temp1: +64.0°C (crit = +120.0°C, hyst = +90.0°C)
radeon-pci-0100
Adapter: PCI adapter
temp1: N/A (crit = +120.0°C, hyst = +90.0°C)
The output of cat /proc/cpuinfo is:
processor : 0
vendor_id : AuthenticAMD
cpu family : 21
model : 16
model name : AMD A8-4500M APU with Radeon(tm) HD Graphics
stepping : 1
microcode : 0x6001116
cpu MHz : 1400.000
cache size : 2048 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 16
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
.
.
.
processor : 1
vendor_id : AuthenticAMD
cpu family : 21
model : 16
model name : AMD A8-4500M APU with Radeon(tm) HD Graphics
stepping : 1
microcode : 0x6001116
cpu MHz : 1400.000
cache size : 2048 KB
physical id : 0
siblings : 4
core id : 1
cpu cores : 2
apicid : 17
initial apicid : 1
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
processor : 2
vendor_id : AuthenticAMD
cpu family : 21
model : 16
model name : AMD A8-4500M APU with Radeon(tm) HD Graphics
stepping : 1
microcode : 0x6001116
cpu MHz : 1400.000
cache size : 2048 KB
physical id : 0
siblings : 4
core id : 2
cpu cores : 2
apicid : 18
initial apicid : 2
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
processor : 3
vendor_id : AuthenticAMD
cpu family : 21
model : 16
model name : AMD A8-4500M APU with Radeon(tm) HD Graphics
stepping : 1
microcode : 0x6001116
cpu MHz : 1400.000
cache size : 2048 KB
physical id : 0
siblings : 4
core id : 3
cpu cores : 2
apicid : 19
initial apicid : 3
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
Question
From the readings above, I am not sure which entry in the sensors output is my CPU temperature, or how the two outputs relate to each other (i.e. which field in the sensors output corresponds to which field in /proc/cpuinfo).
Your CPU temperature is the one shown by the k10temp sensor, but be aware that its readings may be inaccurate. See the kernel documentation:
https://www.kernel.org/doc/Documentation/hwmon/k10temp
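The sensors tool reads its values from the kernel's hwmon sysfs interface, where temperature files such as temp1_input hold the reading in millidegrees Celsius. As a rough sketch of the conversion, assuming the file contents have already been read into a string (the function name and the sample value 65375 are made up for illustration):

```c
#include <stdlib.h>

/* Hypothetical helper: converts the text of a hwmon tempN_input file,
 * which holds millidegrees Celsius (for example "65375\n"),
 * to degrees Celsius. */
double hwmon_to_celsius(const char *text)
{
    return strtol(text, NULL, 10) / 1000.0;
}
```

The input file usually sits under a directory like /sys/class/hwmon/hwmon0/, but the index varies between machines. The name file in the same directory identifies the driver, so the directory whose name reads k10temp is the one to use.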

Why, in Linux VMs such as VMware and XenServer, is the physical id greater than the number of cores in /proc/cpuinfo?

This was only observed in Linux virtual machines. In /proc/cpuinfo, the physical id can be very large, exceeding the number of CPUs.
In the example below, the system has 4 cores but the physical id is 13. I also had another virtual machine with only 2 cores where one physical id was 2.
Does anyone know how a virtual machine defines the Linux physical id?
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz
stepping : 7
microcode : 1808
cpu MHz : 2900.040
cache size : 20480 KB
physical id : 13
siblings : 1
core id : 0
cpu cores : 1
apicid : 13
initial apicid : 13
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu de tsc msr pae cx8 sep cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good unfair_spinlock pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 popcnt tsc_deadline_timer aes hypervisor lahf_lm arat epb pln pts dts
bogomips : 5800.08
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
processor : 1
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz
stepping : 7
microcode : 1808
cpu MHz : 2900.040
cache size : 20480 KB
physical id : 13
siblings : 1
core id : 0
cpu cores : 1
apicid : 13
initial apicid : 13
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu de tsc msr pae cx8 sep cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good unfair_spinlock pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 popcnt tsc_deadline_timer aes hypervisor lahf_lm arat epb pln pts dts
bogomips : 5800.08
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
processor : 2
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz
stepping : 7
microcode : 1808
cpu MHz : 2900.040
cache size : 20480 KB
physical id : 13
siblings : 1
core id : 0
cpu cores : 1
apicid : 13
initial apicid : 13
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu de tsc msr pae cx8 sep cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good unfair_spinlock pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 popcnt tsc_deadline_timer aes hypervisor lahf_lm arat epb pln pts dts
bogomips : 5800.08
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 45
model name : Intel(R) Xeon(R) CPU E5-2690 0 @ 2.90GHz
stepping : 7
microcode : 1808
cpu MHz : 2900.040
cache size : 20480 KB
physical id : 13
siblings : 1
core id : 0
cpu cores : 1
apicid : 13
initial apicid : 13
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu de tsc msr pae cx8 sep cmov pat clflush mmx fxsr sse sse2 ss ht syscall nx lm constant_tsc rep_good unfair_spinlock pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 popcnt tsc_deadline_timer aes hypervisor lahf_lm arat epb pln pts dts
bogomips : 5800.08
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

What does the "model" field represent in the file /proc/cpuinfo?

The following is the content of my /proc/cpuinfo file for a single core:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i3-2100 CPU @ 3.10GHz
stepping : 7
microcode : 0x1a
cpu MHz : 1600.000
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer xsave avx lahf_lm arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips : 6185.67
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:
Can anyone please explain what the "model" field represents? What does "42" imply?
Thanks.
The model is a number that, together with the family, identifies the CPU generation and series. Family 6, model 42 (0x2A) is the 32-nanometer Sandy Bridge architecture. See https://software.intel.com/en-us/articles/intel-architecture-and-processor-identification-with-cpuid-model-and-family-numbers
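The family, model and stepping fields come from the processor signature that CPUID leaf 1 returns in EAX. A minimal sketch of the decoding, following the rule Intel documents (the extended model bits are combined only when the base family is 6 or 15); the struct and function names are hypothetical, and the signature 0x000206A7 used below is simply the value that family 6, model 42, stepping 7 encodes to:

```c
#include <stdint.h>

/* Hypothetical container for the decoded fields. */
struct cpu_signature {
    unsigned family;
    unsigned model;
    unsigned stepping;
};

/* Hypothetical helper: decodes the EAX value of CPUID leaf 1. */
struct cpu_signature decode_cpuid_signature(uint32_t eax)
{
    unsigned stepping    = eax & 0xF;          /* bits 3:0   */
    unsigned base_model  = (eax >> 4) & 0xF;   /* bits 7:4   */
    unsigned base_family = (eax >> 8) & 0xF;   /* bits 11:8  */
    unsigned ext_model   = (eax >> 16) & 0xF;  /* bits 19:16 */
    unsigned ext_family  = (eax >> 20) & 0xFF; /* bits 27:20 */

    struct cpu_signature s;
    s.stepping = stepping;
    /* The extended family is added only when the base family is 15. */
    s.family = (base_family == 0xF) ? base_family + ext_family
                                    : base_family;
    /* The extended model supplies the high nibble for families 6 and 15. */
    s.model = (base_family == 0x6 || base_family == 0xF)
                  ? (ext_model << 4) | base_model
                  : base_model;
    return s;
}
```

For 0x000206A7 the base model is 0xA and the extended model is 0x2, which combine to 0x2A = 42, matching the model line in the output above.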
