We have built a GraphQL API, with our services written in Node.js and leveraging Apollo Server. We are experiencing high CPU usage whenever requests per second reach 20. We did profiling with flamegraphs and Node's built-in profiler. Attaching the result of the built-in profiler:
[Summary]:
   ticks  total  nonlib  name
   87809  32.1%   95.8%  JavaScript
       0   0.0%    0.0%  C++
   32531  11.9%   35.5%  GC
  182061  66.5%          Shared libraries
    3878   1.4%          Unaccounted
[Shared libraries]:
   ticks  total  nonlib  name
  138326  50.5%          /usr/bin/node
   30023  11.0%          /lib/x86_64-linux-gnu/libc-2.27.so
   12466   4.6%          /lib/x86_64-linux-gnu/libpthread-2.27.so
     627   0.2%          [vdso]
     567   0.2%          /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.25
      52   0.0%          /lib/x86_64-linux-gnu/libm-2.27.so
The flamegraph results are consistent with the above: we didn't see any JavaScript function consuming a lot of CPU.
Why is /usr/bin/node consuming so much CPU? Does it have something to do with the way our code is written, or is this the general trend?
Also, to give a little info about what our GraphQL API does: upon receiving a request, it makes 3 to 5 downstream API calls (depending on the request) and doesn't do any CPU-intensive work on its own.
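To make that concrete, here is a stripped-down sketch of the pattern we follow (the names and endpoints below are made up, not our real ones; the real service has a handful of data sources like this):

const { RESTDataSource } = require('apollo-datasource-rest');

// One of several thin wrappers around downstream REST services
// (class name and endpoint are made up for this example).
class AccountsAPI extends RESTDataSource {
  constructor() {
    super();
    this.baseURL = 'https://accounts.internal.example.com/';
  }

  getAccount(id) {
    return this.get(`accounts/${id}`);
  }
}

// A typical resolver: fan out to a few downstream services and merge the results.
// No parsing, crunching, or other CPU-heavy work happens here.
const resolvers = {
  Query: {
    dashboard: async (_, { userId }, { dataSources }) => {
      const [account, orders, offers] = await Promise.all([
        dataSources.accountsAPI.getAccount(userId),
        dataSources.ordersAPI.getOrdersFor(userId),
        dataSources.offersAPI.getOffersFor(userId),
      ]);
      return { account, orders, offers };
    },
  },
};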
Versions:
Node: 10.16.3
graphql-modules/core: 0.7.7
apollo-datasource-rest: 0.5.0
apollo-server-express: 2.6.8
Any help is really appreciated here.
I am running a Node.js application that is experiencing a performance problem under certain loads. I am attempting to use the V8 profiler to find out where the problem might be, basically following this guide.
I've generated a log file during the problem load using node --prof app.js, and analyzed it with node --prof-process isolate-0xnnnnnnnnnnnn-v8.log > processed.txt. This all seems to work fine, but almost all the ticks are spent in the node executable itself:
[Summary]:
   ticks  total  nonlib  name
    3887   5.8%   38.2%  JavaScript
    5590   8.4%   55.0%  C++
     346   0.5%    3.4%  GC
   56296  84.7%          Shared libraries
     689   1.0%          Unaccounted
and:
[Shared libraries]:
   ticks  total  nonlib  name
   55990  84.2%          /usr/bin/node
     225   0.3%          /lib/x86_64-linux-gnu/libc-2.19.so
      68   0.1%          /lib/x86_64-linux-gnu/libpthread-2.19.so
       7   0.0%          /lib/x86_64-linux-gnu/libm-2.19.so
       4   0.0%          [vdso]
       2   0.0%          /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.20
What does this mean? What is the app spending all its time doing? How can I find the performance problem?
I would suggest trying VTune Amplifier as an alternative to the V8 profiler. I was able to identify and fix the time-consuming spot in my own code with it. You can download the free trial version here and just follow these step-by-step instructions. Hope it helps.
I have a few questions regarding a Lighthouse report (see screenshot below).
The first is a culture/locale thing: I assume the value 11.930 ms stands for 11 seconds and 930 ms. Is that the case?
The second is the delayed paint compared to the size. The third entry (7.22 KB) delays the paint by 3,101 ms, while the fourth entry delays the paint by only 1,226 ms even though its JavaScript file is more than three times the size (24.03 KB versus 7.22 KB). Does anybody know what might be the cause?
Screenshot of Lighthouse
This is an extract of a Lighthouse report. In the screenshot of Google Chrome's Lighthouse you can see that a few metrics are written with a comma (11,222 ms) and others with a full stop (7.410 ms).
Thank you for discovering quite a bug! An issue has been filed in the Lighthouse GitHub repo.
To explain what's likely going on: it looks like this report was generated with the CLI (or at least in a locale different from the one it is being displayed in). Some numbers (such as the ones in the table) are converted to strings ahead of time, while others are converted at display time in the browser. The browser-formatted numbers respect your OS/user-selected locale while the pre-stringified numbers do not.
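You can see the two renderings of the same value yourself in a browser console (or any JavaScript environment with full ICU data):

// The same number of milliseconds, stringified under two different locales:
(11930).toLocaleString('en-US'); // "11,930"
(11930).toLocaleString('de-DE'); // "11.930"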
To answer your questions...
Yes, the value it's reporting is 11930 milliseconds or 11 seconds and 930 milliseconds (11,930 ms en-US or 11.930 ms de-DE).
The delayed paint metric reports how many milliseconds after the load started the asset finished loading. Multiple factors influence this number, including when the asset was discovered by the browser, queuing time, server response time, network variability, and payload size. The smaller script that delayed the paint longer likely had a lower priority or was added to your page later than the larger script, and was thus pushed out farther.
I'm almost getting the hang of GHC cost centres... it is an awesome idea, and you can actually fix memory leaks with their profiling tools. But my problem is that the information I'm getting in the .hp profiling output is too truncated:
(1319)GHC.Conc.Signal.CAF 640
(1300)GHC.Event.Thread.CAF 560
(2679)hGetReplies/connect/c... 112
(2597)insideConfig/CAF:lvl2... 32
(1311)GHC.IO.Handle.FD.CAF 656
(2566)setLoggerLevels/confi... 208
(2571)configureLoggingToCon... 120
(2727)reply/Database.Redis.... 32
How do I know, for example, what the full cost-centre stack in (2566) or (2559) is? Is there a tool for that, or a command-line option?
Pass +RTS -L100 to your program when you run it with profiling (for example, ./yourprog +RTS -hc -L100 -RTS if you are heap-profiling by cost centre), and change 100 to however many characters of your cost centres you want to see.
The documentation can be found in the GHC user guide, section “RTS options for heap profiling”.
In the static vs. shared libraries debates, I've often heard that shared libraries eliminate duplication and reduce overall disk space. But how much disk space do shared libraries really save in modern Linux distros? How much more space would be needed if all programs were compiled using static libraries? Has anyone crunched the numbers for a typical desktop Linux distro such as Ubuntu? Are there any statistics available?
ADDENDUM:
All answers were informative and are appreciated, but they seemed to shoot down my question rather than attempt to answer it. Kaleb was on the right track, but he chose to crunch the numbers for memory space instead of disk space (my question was for disk space).
Because programs only "pay" for the portions of static libraries that they use, it seems practically impossible to quantitatively know what the disk space difference would be for all static vs all shared.
I feel like trashing my question now that I realize it's practically impossible to answer. But I'll leave it here to preserve the informative answers.
So that SO stops nagging me to choose an answer, I'm going to pick the most popular one (even if it sidesteps the question).
I'm not sure where you heard this, but reduced disk space is mostly a red herring as drive space approaches pennies per gigabyte. The real gain with shared libraries comes with security and bugfix updates for those libraries; applications using static libraries have to be individually rebuilt with the new libraries, whereas all apps using shared libraries can be updated at once by replacing only a few files.
Not only do shared libraries save disk space, they also save memory, and that's a lot more important. The prelinking step is important here... you can't share the memory pages between two instances of the same library unless they are loaded at the same address, and prelinking allows that to happen.
Shared libraries do not necessarily save disk space or memory.
When an application links against a static library, only the parts of the library that the application uses are pulled into the application binary. The library archive (.a) is a collection of object files (.o), and if those are well factored, the application will use less memory by linking only the object files it actually needs. A shared library, on the other hand, contains the whole library on disk and in memory, whether or not applications use all of it.
For desktop and server systems, static linking is less likely to result in a win overall, but if you are developing embedded applications, it's worth trying to statically link all the applications to see whether that gives you an overall saving.
I was able to figure out a partial quantitative answer without having to do an obscene amount of work. Here is my (harebrained) methodology:
1) Use the following command to generate a list of packages with their installed size and list of dependencies:
dpkg-query -Wf '${Package}\t${Installed-Size}\t${Depends}\n'
2) Parse the results and build a map of statistics for each package:
#include <map>
#include <string>

struct PkgStats
{
    PkgStats() : kbSize(0), dependentCount(0) {}
    int kbSize;
    int dependentCount;
};

typedef std::map<std::string, PkgStats> PkgMap;
Where dependentCount is the number of other packages that directly depend on that package.
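For illustration (this is not the exact code I ran), the same bookkeeping can be sketched in a few lines of Node.js; the Depends parsing below is simplified to drop version constraints and keep only the first alternative in each clause:

// Simplified sketch: record installed sizes and count direct dependents.
const { execSync } = require('child_process');

const output = execSync(
  "dpkg-query -Wf '${Package}\\t${Installed-Size}\\t${Depends}\\n'",
  { encoding: 'utf8', maxBuffer: 10 * 1024 * 1024 } // package list can be large
);

const kbSize = new Map();          // package -> installed size in KB
const dependentCount = new Map();  // package -> # of packages that depend on it

for (const line of output.trim().split('\n')) {
  const [pkg, kb, depends = ''] = line.split('\t');
  kbSize.set(pkg, parseInt(kb, 10) || 0);
  for (const clause of depends.split(',')) {
    // "libc6 (>= 2.4) | libc6.1" -> "libc6"
    const name = clause.split('|')[0].trim().split(/[\s(]/)[0];
    if (name) dependentCount.set(name, (dependentCount.get(name) || 0) + 1);
  }
}

// "Dup'd MB": size duplicated if every dependent carried its own copy.
for (const [pkg, count] of dependentCount) {
  if (count > 1 && kbSize.has(pkg)) {
    const dupMB = Math.round((kbSize.get(pkg) * (count - 1)) / 1024);
    console.log(pkg, kbSize.get(pkg), count, dupMB);
  }
}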
Results
Here is the Top 20 list of packages with the most dependents on my system:
Package              Installed KB  # Deps  Dup'd MB
libc6                       10096     750      7385
python                        624     112        68
libatk1.0-0                   200      92        18
perl                        18852      48       865
gconf2                        248      34         8
debconf                       988      23        21
libasound2                   1428      19        25
defoma                        564      18         9
libart-2.0-2                  164      14         2
libavahi-client3              160      14         2
libbz2-1.0                    128      12         1
openoffice.org-core        124908      11      1220
gcc-4.4-base                  168      10         1
libbonobo2-0                  916      10         8
cli-common                    336       8         2
coreutils                   12928       8        88
erlang-base                  6708       8        46
libbluetooth3                 200       8         1
dictionaries-common          1016       7         6
where Dup'd MB is the number of megabytes that would be duplicated if there were no sharing (= installed_size_KB * (dependent_count - 1) / 1024, for dependent_count > 1; e.g. libc6: 10096 KB * 749 / 1024 ≈ 7385 MB).
It's not surprising to see libc6 on top. :) BTW, I have a typical Ubuntu 9.10 setup with a few programming-related packages installed, as well as some GIS tools.
Some statistics:
Total installed packages: 1717
Average # of direct dependents: 0.92
Total duplicated size with no sharing (ignoring indirect dependencies): 10.25GB
Histogram of # of direct dependents (note logarithmic Y scale):
Note that the above totally ignores indirect dependencies (i.e. everything should at least be indirectly dependent on libc6). What I really should have done is build a graph of all dependencies and use that as the basis for my statistics. Maybe I'll get around to it sometime and post a lengthy blog article with more details and rigor.
OK, perhaps not an answer, but memory savings are what I'd consider. The savings depend on the number of times a library is loaded after the first application, so let's find out how much is saved per library on the system using a quick script:
#!/bin/bash
# For each shared object currently mapped by more than one process, estimate the
# memory that sharing saves: (number of extra users) * (size of the library).
lastlib=""
declare -i cnt=1
declare -i size=0
while read -r lib ; do
    if [ "$lastlib" = "$lib" ] ; then
        cnt=$((cnt + 1))
    else
        if [ -n "$lastlib" ] ; then
            size=$(ls -l "$lastlib" | awk '{print $5}')
            echo "$lastlib: $(( (cnt - 1) * size ))"
        fi
        cnt=1
    fi
    lastlib="$lib"
done < <(lsof | grep 'lib.*\.so$' | awk '{print $9}' | sort)
# Don't forget the final library in the sorted list.
if [ -n "$lastlib" ] ; then
    size=$(ls -l "$lastlib" | awk '{print $5}')
    echo "$lastlib: $(( (cnt - 1) * size ))"
fi
That will give us the savings per library, like so:
...
/usr/lib64/qt4/plugins/crypto/libqca-ossl.so: 0
/usr/lib64/qt4/plugins/imageformats/libqgif.so: 540640
/usr/lib64/qt4/plugins/imageformats/libqico.so: 791200
...
Then, the total savings:
$ ./checker.sh | awk '{total = total + $2}END{print total}'
263160760
So, roughly speaking, on my system I'm saving about 250 MB of memory. Your mileage will vary.