I have a Linux server with an Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz processor, which can reach up to 3.00GHz thanks to Intel Turbo Boost technology.
However, when I check cpuinfo, it says that all cores are always running at 2095.078MHz.
There are no ups and downs, no matter what (heavy processes on the server, etc.).
(I'm checking it by running cat /proc/cpuinfo | grep "MHz")
By comparison, my laptop shows a different MHz value on every run of the command.
There is also no scaling_governor setting (I wanted to set performance mode).
Running cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor reports that the cpufreq folder doesn't exist at all.
I'm just curious whether my processor is actually hitting the 3.0GHz Turbo, because I don't feel like it does.
I feel like I'm lacking some kind of driver for frequency scaling.
It seems likely that the missing frequency scaling driver is your answer. If you can get the cpufreq driver loaded, you can consult the kernel docs to experiment and get the performance you want: kernel.org cpufreq doc.
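A minimal sketch of that check, assuming the usual sysfs layout under /sys/devices/system/cpu. The function name cpufreq_status and its optional root argument are made up for illustration; only the sysfs file names come from the kernel.

```shell
#!/bin/sh
# cpufreq_status [SYSFS_ROOT]
# Reports whether a cpufreq scaling driver is active for cpu0.
# SYSFS_ROOT is a hypothetical override, defaulting to the real sysfs path.
cpufreq_status() (
    root="${1:-/sys/devices/system/cpu}"
    dir="$root/cpu0/cpufreq"
    if [ -d "$dir" ]; then
        # scaling_driver and scaling_governor are standard cpufreq attributes
        printf 'driver=%s governor=%s\n' \
            "$(cat "$dir/scaling_driver")" \
            "$(cat "$dir/scaling_governor")"
    else
        echo "no cpufreq driver loaded"
    fi
)

cpufreq_status
```

If this prints "no cpufreq driver loaded", the kernel has no scaling driver bound to the CPU, which matches the missing cpufreq folder described in the question.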
OS: Linux 4.4.198-1.el7.elrepo.x86_64
CPU: Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz * 4
MEM: 376 GB
I have a program (LSTM model inference based on TensorFlow 1.14) that runs on two machines with the same hardware. One performs badly while the other performs much better (about a 10x difference).
I used the Intel pqos tool to diagnose the two processes and got very different IPC numbers (0.07 on one, 2.5 on the other). Both processes are bound to specific CPU cores, and neither machine is heavily loaded. The problem appeared two weeks ago; before that the bad machine worked just as well, and the history command shows no configuration changes.
I checked a lot of environment information, including the kernel, filesystem, process scheduler, I/O scheduler, and the md5 sums of the program and libraries; they are all the same. The bad machine's iLO shows no errors. The program is mainly CPU-bound.
I used sysbench to test both machines (CPU and memory), which showed about a 25% performance difference; the bad machine is slower at the prime calculation. Could it be a hardware problem?
I don't know the root cause of the difference in IPC (which translates into the performance difference). How can I dig into the situation?
I replaced the i5 CPU in my laptop with an i7 so that it would run faster.
But because the i7 draws more power and runs hotter than before, my laptop crashed frequently. So I used cpupower to cap the maximum CPU frequency, and that works.
Now, my question is: "Is there a way to specify the CPU frequency as a command-line parameter of the Linux kernel, at boot time?", so that I can be sure the system boots stably and correctly.
Btw, if the new CPU runs at 2.5GHz at most, everything is OK, and the performance is more than twice that of the old one, so I think changing the CPU was worth it.
Thanks a lot!
UPDATE - 2018-11-25
Also, I want to mention that the commands below use the CPUFreq subsystem directly, without any tool (such as cpufrequtils, which serves the same purpose). Sometimes these tools lack features, or they simply don't work as we want. Because the CPUFreq core creates a sysfs directory under /sys/devices/system/cpu/, some attributes are available as read-write and can be changed at the kernel level. These attributes are called policies, as CPUFreq has a policy interface in sysfs. The commands below can be run at boot time (for example from a startup script) so that the settings persist across reboots; values written to sysfs are not saved on their own.
If intel_pstate is used as the scaling driver (this part may help to avoid higher frequencies if you decide to keep intel_pstate):
Turbo can be disabled to prevent higher frequencies.
echo "1" | sudo tee /sys/devices/system/cpu/intel_pstate/no_turbo
After this, the command below can be useful.
echo "70" | sudo tee /sys/devices/system/cpu/intel_pstate/max_perf_pct (70 can be replaced by another percentage, depending on the base and turbo clock speeds; 70-80 should be enough to stay at or below 2.5 GHz)
This attribute is explained as follows in https://www.kernel.org/doc/Documentation/cpu-freq/intel-pstate.txt and may help to cap CPU frequencies.
max_perf_pct: Limits the maximum P-State that will be requested by the
driver. It states it as a percentage of the available performance.
P-states are operating states, and going from Pn to P0 the frequencies increase. So limiting the maximum P-state to a percentage of the maximum supported performance level can be useful. Check this link: https://software.intel.com/en-us/blogs/2008/05/29/what-exactly-is-a-p-state-pt-1
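As a rough sketch of how to pick a value for max_perf_pct: assuming the percentage maps approximately linearly onto frequency, it can be estimated from the frequency you want as a cap and the highest frequency the CPU reports (for example in cpuinfo_max_freq). The helper name perf_pct and the 3.5 GHz maximum are illustrative assumptions, not values from the question, and the real mapping may differ once no_turbo is set.

```shell
#!/bin/sh
# perf_pct TARGET_KHZ MAX_KHZ
# Prints TARGET_KHZ as an integer percentage of MAX_KHZ, rounded down
# so that the resulting cap never exceeds the target.
perf_pct() {
    echo $(( $1 * 100 / $2 ))
}

# Example values only: cap an assumed 3.5 GHz maximum at 2.5 GHz.
perf_pct 2500000 3500000
```

The result is only a starting point; it is worth confirming the effect by watching the MHz values in /proc/cpuinfo under load.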
Also, with intel_pstate, all CPUs share the same limits. While using intel_pstate as the scaling driver, per-CPU performance limits via cpufreq attributes (e.g. scaling_max_freq) can be enabled by adding the kernel parameter below:
intel_pstate=per_cpu_perf_limits
Otherwise, CPUs can be set separately;
echo -n 2457600 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq
echo -n 2457600 > /sys/devices/system/cpu/cpu1/cpufreq/scaling_max_freq
echo -n 2457600 > /sys/devices/system/cpu/cpu2/cpufreq/scaling_max_freq
echo -n 2457600 > /sys/devices/system/cpu/cpu3/cpufreq/scaling_max_freq
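The four commands above can also be written as a loop, so the same limit is applied to however many CPUs the machine has. This is a sketch: set_max_freq and its optional root argument are hypothetical names, and the function has to be run as root to write to the real sysfs.

```shell
#!/bin/sh
# set_max_freq KHZ [SYSFS_ROOT]
# Writes KHZ to scaling_max_freq for every cpuN directory that has one.
# SYSFS_ROOT is a hypothetical override, defaulting to the real sysfs path.
set_max_freq() (
    khz="$1"
    root="${2:-/sys/devices/system/cpu}"
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_max_freq; do
        # Skip the unexpanded pattern when no CPU exposes cpufreq
        [ -e "$f" ] || continue
        echo "$khz" > "$f"
    done
)

# Usage (as root): set_max_freq 2457600
```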
But there is an important detail: a built-in script shipped with Ubuntu (/etc/init.d/ondemand). If ondemand or powersave is used as the scaling governor, the settings we apply (like the ones above) can collide with this script. The script should be disabled with the command below:
sudo /usr/sbin/update-rc.d ondemand disable
Further info is in here: https://help.ubuntu.com/community/UbuntuStudio/Setting_CPU_Governor
After disabling ondemand, other scaling governors (like userspace or performance) can be set and used together with the configuration above.
These are all fundamental commands (both the part below and the part above), and they should help to solve the CPU frequency scaling problem. I also wanted to record this information for future reference.
First of all, I want to give some information about CPU frequency scaling.
Three terms are related to this process (they are layers of a subsystem called "CPU Performance Scaling"), and they are worth reviewing briefly to make sure everything is understood correctly.
CPUFreq Core
Scaling Driver
Scaling Governor
The CPUFreq core is the basic framework and contains the common code infrastructure for all platforms that support this feature.
The scaling driver communicates with the hardware and changes the CPU P-states requested by the scaling governors.
(P-states are operating states, in contrast to C-states, which are idle states, except for C0, which is the busy, active state.)
Scaling governors implement the scaling algorithms.
By the way, CPU performance scaling is a deep topic and there are many things to consider. Basically, with the information above, the commands below should meet your needs.
Firstly, I think intel_pstate is currently used as the scaling driver on your laptop. So disabling it may give us more advanced settings and more governors (intel_pstate has only two governors, powersave and performance). I think powersave is the default governor for intel_pstate.
sudo vi /etc/default/grub
Add intel_pstate=disable to the GRUB_CMDLINE_LINUX_DEFAULT parameter.
GRUB_CMDLINE_LINUX_DEFAULT="intel_pstate=disable"
After adding the parameter, execute the commands below.
modprobe acpi-cpufreq
sudo update-grub
You can check the kernel parameters used at boot with the command below:
cat /proc/cmdline
This way, acpi-cpufreq will be enabled as the scaling driver (because intel_pstate is disabled). The next step is either setting the governor to userspace to run the CPU at the desired frequencies, or leaving it at the default (ondemand should be the default setting for acpi-cpufreq).
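To confirm after a reboot that the parameter really reached the kernel, the check of /proc/cmdline can be scripted. A small sketch, where has_cmdline_param and its optional file argument are illustrative names:

```shell
#!/bin/sh
# has_cmdline_param PARAM [CMDLINE_FILE]
# Succeeds if PARAM appears as a whole word on the kernel command line.
# CMDLINE_FILE is a hypothetical override, defaulting to /proc/cmdline.
has_cmdline_param() (
    file="${2:-/proc/cmdline}"
    set -f   # no filename expansion while splitting the command line
    for word in $(cat "$file"); do
        [ "$word" = "$1" ] && exit 0
    done
    exit 1
)

if has_cmdline_param intel_pstate=disable; then
    echo "intel_pstate=disable is on the kernel command line"
else
    echo "intel_pstate=disable not found in /proc/cmdline"
fi
```

If the parameter is present, /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver should then report acpi-cpufreq, provided that driver is available on the machine.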
First Way of Setting Governor and Maximum Frequency Setting
If you want to change scaling governor (e.g. to userspace):
sudo update-rc.d ondemand disable (this prevents the settings from being reset after a reboot)
sudo apt install cpufrequtils (to control the CPU frequency scaling daemon)
echo 'GOVERNOR="userspace"' | sudo tee /etc/default/cpufrequtils
After these steps, we should have acpi-cpufreq as the scaling driver and ondemand (if you didn't change the governor) as the scaling governor. So the last step is setting the maximum frequency of the CPU.
Editing /etc/default/cpufrequtils as shown below should set the CPU frequencies. If the file doesn't exist, create it.
MAX_SPEED="2457600"
MIN_SPEED="1536000"
Also check the lines below in the same file.
ENABLE="true"
GOVERNOR="ondemand" (or userspace)
But with this approach, I think there is no guarantee that all CPU cores are set to the same frequency values. I have seen some people say that the second way below sets all CPU cores to the desired values, while the first way does not.
Second Way of Setting Governor and Maximum Frequency Setting
Install tlp (Linux Power management tool)
sudo apt install tlp
After installing, edit /etc/default/tlp like below:
# Select a CPU frequency scaling governor:
# ondemand, powersave, performance, conservative
# Intel Core i processor with intel_pstate driver:
# powersave, performance
# Important:
# You must disable your distribution's governor settings or conflicts will
# occur. ondemand is sufficient for almost all workloads, you should know
# what you're doing!
CPU_SCALING_GOVERNOR_ON_AC=ondemand
CPU_SCALING_GOVERNOR_ON_BAT=ondemand
# Set the min/max frequency available for the scaling governor.
# Possible values strongly depend on your CPU. For available frequencies see
# tlp-stat output, Section "+++ Processor".
CPU_SCALING_MIN_FREQ_ON_AC=0
CPU_SCALING_MAX_FREQ_ON_AC=0
CPU_SCALING_MIN_FREQ_ON_BAT=1536000
CPU_SCALING_MAX_FREQ_ON_BAT=2457600
The settings above should be kept after restarting or suspending the device.
I have tried to provide and explain ways to set the CPU frequency (and to keep the settings persistent), and I may have forgotten something. So please check the information above and see whether it meets your needs. Also, you can use the command below to make sure that everything is right.
cpufreq-info
Note: Please check below pages for more information.
Governors list
https://www.kernel.org/doc/Documentation/cpu-freq/governors.txt
https://www.kernel.org/doc/html/v4.14/admin-guide/pm/cpufreq.html
https://www.kernel.org/doc/html/v4.12/admin-guide/pm/intel_pstate.html
I finally have time to reply; I was busy with other things.
I tried all of the solutions above, and chose "tlp + lm-sensors + psensor".
Here are my opinions:
cpupower is a simple but relatively feature-poor tool; it can only set the MAX/MIN frequency of the CPU and the governor.
cpufrequtils is basically the same as cpupower, except that it is based on the ACPI driver, not the genuine Intel one. I guess a genuine Intel driver with P-state support should be a better choice for an Intel CPU.
tlp is my final choice; it has more features to monitor/throttle the temperature and frequency of the CPU, and more configurable options.
Yes, as Erdem Savasci said, with tlp the MAX/MIN frequencies of all CPU cores can be set in one step, which can NOT be done with cpufrequtils.
In addition, I installed lm-sensors and psensor. The former can be thought of as a driver for querying the temperature/frequency/fan speed; the latter is a GUI panel that shows the information mentioned above.
With these tools, I believe that my CPU will run stably.
But a solution to "ensure the CPU runs stably AT BOOT TIME" has not been found yet.
All of the above are started after boot, aren't they?
Sorry for my poor English; I'm Chinese. I hope I have expressed things correctly.
Thanks again!
There is a known way to disable logical CPUs in Linux, basically with echo 0 > /sys/devices/system/cpu/cpu<number>/online. This way, you are only telling the OS to ignore the given CPU (<number>).
My question goes further: is it possible not only to ignore it but to physically turn it off programmatically? I want that CPU to receive no power at all, so that its energy consumption is zero.
I know that it is sometimes possible to disable cores from the BIOS, but I want to know whether it is possible to do it from within a program.
When you do echo 0 > /sys/devices/system/cpu/cpu<number>/online, what happens next depends on the particular CPU. On ARM embedded systems the kernel will typically disable the clock that drives the particular core's PLL, so effectively you get what you want.
On Intel x86 systems, you can only disable the interrupts and call the hlt instruction (which the Linux kernel does). This effectively puts the CPU into a power-saving state until it is woken up by another CPU at the user's request. If you have a laptop, you can verify that the power draw indeed goes down when you disable the core by reading the current from /sys/class/power_supply/BAT{0,1}/current_now (or uevent for all values such as voltage) or by using the "powertop" utility.
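A rough sketch of that measurement on a laptop, assuming the battery exposes current_now (some batteries expose power_now instead). The helper names and the optional root arguments are made up for illustration, and writing to the online file requires root.

```shell
#!/bin/sh
# set_cpu_online CPU 0|1 [SYSFS_ROOT]
# Takes a logical CPU offline (0) or brings it back online (1).
set_cpu_online() (
    root="${3:-/sys/devices/system/cpu}"
    echo "$2" > "$root/cpu$1/online"
)

# battery_current [POWER_SUPPLY_ROOT]
# Prints current_now (in microamps) of the first battery that exposes it;
# fails if no battery does.
battery_current() (
    root="${1:-/sys/class/power_supply}"
    for f in "$root"/BAT*/current_now; do
        [ -r "$f" ] && { cat "$f"; exit 0; }
    done
    exit 1
)

# Usage (as root, on battery power, with the system otherwise idle):
#   before=$(battery_current)
#   set_cpu_online 3 0
#   sleep 10
#   after=$(battery_current)
#   echo "before=$before after=$after"
```

Single readings are noisy, so averaging several samples before and after gives a more trustworthy comparison.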
For example, here's the call chain for disabling the CPU core in Linux Kernel for Intel CPUs.
https://github.com/torvalds/linux/blob/master/drivers/cpufreq/intel_pstate.c
arch/x86/kernel/smp.c: smp_ops.play_dead = native_play_dead,
arch/x86/kernel/smpboot.c : native_play_dead() -> play_dead_common() -> local_irq_disable()
Before that, CPUFREQ also sets the CPU to the lowest power consumption level before disabling it though this does not seem to be strictly necessary.
intel_pstate_stop_cpu -> intel_cpufreq_stop_cpu -> intel_pstate_set_min_pstate -> intel_pstate_set_pstate -> wrmsrl_on_cpu(cpu->cpu, MSR_IA32_PERF_CTL, pstate_funcs.get_val(cpu, pstate));
On Intel x86 there does not seem to be an official way to disable the actual clocks and voltage regulators. Even if there were, it would be specific to the motherboard, so your best bet might be looking into BIOS firmware such as coreboot.
Hmm, I realized I have no idea about Intel except looking into kernel sources.
In Windows 10 it became possible with the new power management settings CPMINCORES and CPMAXCORES.
Powercfg -setacvalueindex scheme_current sub_processor CPMAXCORES 50
Powercfg -setacvalueindex scheme_current sub_processor CPMINCORES 25
Powercfg -setactive scheme_current
Here 50% of the cores are designated for deep sleep (parking), and 25% are forbidden from being parked. This works very well in numeric simulations that benefit from an increased clock rate (15% boost on Intel).
You cannot choose which cores to park, but the Windows 10 kernel checks which cores are "preferred" (more power efficient) on Intel Comet Lake and newer, and starts by parking those that are not preferred.
It is not strict parking, so under high load the kernel can still use these cores, at very low load.
Just in case you are looking for alternatives.
You can get closest to this by using the cpufreq governors. Make Linux exclude the CPU, and the power-saving mode will ensure that the core runs at its minimal frequency.
You can also isolate cpus from the scheduler at kernel boot time.
Add isolcpus=0,1,2 to the kernel boot parameters.
https://www.linuxtopia.org/online_books/linux_kernel/kernel_configuration/re46.html
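After booting with isolcpus, the kernel lists the isolated CPUs in /sys/devices/system/cpu/isolated using its usual range notation (for example 0-2). The sketch below expands that notation into individual CPU numbers; expand_cpu_list is an illustrative helper name, not an existing tool.

```shell
#!/bin/sh
# expand_cpu_list LIST
# Turns a kernel CPU list such as "0-2,5" into "0 1 2 5".
expand_cpu_list() (
    out=""
    IFS=,
    for part in $1; do
        case "$part" in
            *-*) lo=${part%-*}; hi=${part#*-} ;;   # a range like 0-2
            *)   lo=$part; hi=$part ;;             # a single CPU
        esac
        i=$lo
        while [ "$i" -le "$hi" ]; do
            out="$out${out:+ }$i"
            i=$((i + 1))
        done
    done
    echo "$out"
)

isolated_file=/sys/devices/system/cpu/isolated
if [ -r "$isolated_file" ]; then
    echo "isolated CPUs: $(expand_cpu_list "$(cat "$isolated_file")")"
fi
```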
I know this is an old question but one way to disable the CPU is via grub config.
Add the parameter to the end of GRUB_CMDLINE_LINUX in /etc/default/grub (assuming you are using a standard Linux distribution; if you are using an appliance, the location of the grub config may be different), e.g.:
GRUB_CMDLINE_LINUX=".......Current config here maxcpus=2"
Then regenerate your grub config by running
grub2-mkconfig -o /boot/grub2/grub.cfg (or grub-mkconfig -o /boot/grub/grub.cfg, depending on your installation). Some distros may require nr_cpus instead of maxcpus.
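If you prefer not to edit the file by hand, the change can be scripted. This is only a sketch: add_grub_param is a hypothetical helper, it assumes GNU sed (for -i) and a double-quoted GRUB_CMDLINE_LINUX line, and it does not handle parameters containing | or & characters. Keep a backup of /etc/default/grub before trying it on a real system.

```shell
#!/bin/sh
# add_grub_param PARAM [GRUB_FILE]
# Appends PARAM inside the quotes of GRUB_CMDLINE_LINUX="..." unless the
# line already contains it (simple substring check).
add_grub_param() (
    param="$1"
    file="${2:-/etc/default/grub}"
    line=$(grep '^GRUB_CMDLINE_LINUX=' "$file") || exit 1
    case "$line" in
        *"$param"*) exit 0 ;;   # already present, nothing to do
    esac
    sed -i "s|^GRUB_CMDLINE_LINUX=\"\(.*\)\"|GRUB_CMDLINE_LINUX=\"\1 $param\"|" "$file"
)

# Usage (as root), followed by regenerating the grub config:
#   add_grub_param maxcpus=2
```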
Just some extra info:
If you are running a server with multiple physical CPUs, then disabling one CPU will most likely also disable the memory bank linked to that CPU, so it may affect the performance of the server.
Disabling the CPU this way will not prevent your type 1 hypervisor from accessing the CPU (this is based on the Xen hypervisor; I believe it applies to VMware as well, and it would be great if anyone could confirm). Depending on the VirtualBox setup, it may restrict the number of CPUs you can allocate to VMs unless you are running para-virtualization.
I am unsure, however, whether you will see any power savings. Most servers and even desktops these days already control power well, putting to sleep any device not needed for the current load. My concern is that by reducing the number of CPUs (cores) you will just be moving the load to the remaining CPUs. Because of the need to schedule processor time, instructions potentially being queued, and fewer cores being available for interrupts (e.g. network traffic), it may even have a negative effect on power consumption.
AFAIK there is no system call or library function available as of now, nor even an ioctl implementation. So apart from creating a new module / system call, there are two ways I can think of:
using inline assembly, asm(<assembly code>);, where the assembly code is architecture-specific code that modifies the CPU flag.
calling system() in C (man 3 system), assuming you just want to do it from C.
Processor: AM335x 1GHz ARM® Cortex-A8, DDR3 RAM
Somehow I might be missing the correct kernel setting, or just the knowledge of where to look in /sys/...
I already tried to get decode-dimms and sensors-detect running, so far without much success. I changed the RAM timings, but now I'd like to see the actual settings at runtime. Is it even possible using this setup?
I have searched the various questions (and the web) but did not find any satisfactory answer.
I am curious about whether to use threads to load the cores of the CPU directly or to use an OpenCL implementation. Is OpenCL just there to make multi-processor/multi-core code more portable, meaning the code can be ported to either GPU or CPU, or is OpenCL faster and more efficient? I am aware that GPUs have more processing units, but that is not the question. Should I use explicit multithreading in code, or OpenCL?
Sorry, I have another question...
If the IGP shares PCI lines with the discrete graphics card and its drivers cannot be loaded under Windows 7, I have to assume that it will not be available, even if I only want to use the processing cores of the integrated GPU. Is this correct, or is there a way to access the IGP without drivers?
EDIT: As @Yann Vernier pointed out in the comment section, I haven't been strict enough with the terms I used. So in this post I use the term thread as a synonym for work-item. I'm not referring to CPU threads.
I can't really compare OCL with other technologies that allow using the different cores of a CPU, as I have only used OCL so far.
However, I can bring some input about OCL, especially since I don't really agree with ScottD.
First of all, even though an OCL kernel developed to run on a GPU will also run on a CPU, that doesn't mean it will be efficient. The reason is simply that OCL doesn't work the same way on CPU and GPU. For a good understanding of how it differs, see chapter 6 of "Heterogeneous Computing with OpenCL". To summarize, while the GPU will launch a bunch of threads within a given workgroup at the same time, the CPU will execute the threads of a workgroup one after another on a core. See also section 3.4 of the standard about the two different types of programming models supported by OCL. This can explain why an OCL kernel could be less efficient on a CPU than "classic" code: because it was designed for a GPU. Whether a developer targets the CPU or the GPU is not a question of "serious work" but simply depends on which programming model best suits your needs. Also, the fact that OCL supports CPUs as well is nice, since it can degrade gracefully on computers not equipped with a proper GPU (though it must be hard to find such a computer).
Regarding the AMD platform, I've also noticed some problems with the CPU on a laptop with an ATI GPU. I observed low performance on some of my code, and crashes as well. But the reason was that the processor was an Intel. The AMD platform will declare a CPU device as available even if it is an Intel CPU; however, it won't be able to use it as efficiently as it should. When I ran the exact same code targeting the CPU after installing (and using) the Intel platform, all the issues were gone. That's another possible reason for poor performance.
Regarding the iGPU, it does not share PCIe lines; it is on the CPU die (at least on Intel), and yes, you need the driver to use it. I assume that you tried to install the driver and got a message like "your computer does not meet the minimum requirement…" or something similar. I guess it depends on the computer, but in my case I have a desktop equipped with an NVIDIA card and an i7 CPU (it has an HD4000 GPU). In order to use the iGPU I first had to enable it in the BIOS, which allowed me to install the driver. Of course only one of the two GPUs is used by the display at a time (depending on the BIOS setting), but I can access both with OCL.
In recent experiments using the Intel OpenCL tools, we found that the OpenCL performance was very similar to CUDA and to intrinsics-based AVX code built with gcc and icc, way better than in earlier experiments (some years ago) where we saw OpenCL perform worse.