Stack traces from node are sometimes truncated. How can I see the full error? - node.js

I have a route that (deliberately) crashes my node app. When I visit that route, I get a proper log of the crash:
/Users/me/Documents/myapp/routes/index.js:795
global.fakeMethod();
^
TypeError: global.fakeMethod is not a function
at null._onTimeout (/Users/me/Documents/myapp/routes/index.js:795:11)
at Timer.listOnTimeout (timers.js:92:15)
However, when I run that same code under systemd, the error is truncated. It looks like this:
May 17 10:03:56 a.myapp.com www[28766]: /var/www/myapp/routes/index.js:795
May 17 10:03:56 a.myapp.com systemd[1]: myapp.service: main process exited, code=exited, status=1/FAILURE
May 17 10:03:56 a.myapp.com systemd[1]: Unit myapp.service entered failed state.
May 17 10:03:56 a.myapp.com systemd[1]: myapp.service failed.
May 17 10:03:56 a.myapp.com systemd[1]: myapp.service holdoff time over, scheduling restart.
How can I make systemd / journald log the full error?
Update: testing with systemd-cat, I made a multiline file, and logging it works:
cat file.txt | systemd-cat
results in:
Mar 02 09:51:25 a.certsimple.com unknown[31600]: line one
Mar 02 09:51:25 a.certsimple.com unknown[31600]: line two
Mar 02 09:51:25 a.certsimple.com unknown[31600]: line three

My best bet is that it has something to do with stderr/stdout not being flushed before your application terminates.
Is there any way to tell your application to print the stack trace via the synchronous syslog protocol instead of printing it to stdout?
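For instance, a minimal sketch of that idea (my own illustration, not a tested fix) that forces a synchronous write to stderr for uncaught exceptions, sidestepping the stream buffering entirely:
var fs = require('fs');
// fs.writeSync blocks until the bytes are written, so the trace
// survives even if the process dies immediately afterwards.
process.on('uncaughtException', function (err) {
    fs.writeSync(process.stderr.fd, (err.stack || String(err)) + '\n');
    process.exit(1);
});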

This is not a systemd issue. It's a [node issue](https://github.com/nodejs/node/issues/6456): node's process.exit() always exits ASAP, without waiting for pending stdout/stderr writes. Setting process.exitCode and letting the process end naturally will flush the buffers.
See the main issue for node v6 at: https://github.com/nodejs/node/issues/6456
As a workaround I'm wrapping process.exit():
var wrap = require('lodash.wrap');
var log = console.log.bind(console);
var RESTART_FLUSH_DELAY = 3 * 1000;

// Delay the real exit so stdout/stderr can drain, and keep the
// original exit code instead of dropping it.
process.exit = wrap(process.exit, function (originalExit, code) {
    log('Waiting', RESTART_FLUSH_DELAY, 'ms for buffers to flush before restarting');
    setTimeout(function () { originalExit(code); }, RESTART_FLUSH_DELAY);
});

process.exit(1);
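For comparison, a minimal sketch of the process.exitCode approach from the issue above: set the code and let node terminate on its own once the event loop is empty, which gives the streams time to flush.
// Instead of calling process.exit(), record the code...
process.exitCode = 1;
// ...and schedule no further work; node exits with code 1 once the
// event loop drains, flushing any buffered stdout/stderr on the way out.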

Related

Elasticsearch Enabling Remote Connection - Crashes AFTER Change

I just installed Filebeat, Logstash, Kibana, and Elasticsearch, all running smoothly, to trial the product for additional monthly reports/monitoring. I noticed that every time I change the "/etc/elasticsearch/elasticsearch.yml" config file to allow remote web access, the service crashes.
I'm new to the forum and to this product; my end goal with this question is to figure out how to allow remote connections to Elasticsearch so I can experiment and test without crashing it.
For reference, here is the error output when I run 'sudo systemctl status elasticsearch':
Dec 30 07:27:37 ubuntu systemd[1]: Starting Elasticsearch...
Dec 30 07:27:52 ubuntu systemd-entrypoint[4067]: ERROR: [1] bootstrap checks failed. You must address the points described in the following [1] lines before starting Elasticsearch.
Dec 30 07:27:52 ubuntu systemd-entrypoint[4067]: bootstrap check failure [1] of [1]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.se>
Dec 30 07:27:52 ubuntu systemd-entrypoint[4067]: ERROR: Elasticsearch did not exit normally - check the logs at /var/log/elasticsearch/elasticsearch.log
Dec 30 07:27:53 ubuntu systemd[1]: elasticsearch.service: Main process exited, code=exited, status=78/CONFIG
Dec 30 07:27:53 ubuntu systemd[1]: elasticsearch.service: Failed with result 'exit-code'.
Dec 30 07:27:53 ubuntu systemd[1]: Failed to start Elasticsearch.
Any help on this is greatly appreciated!
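The bootstrap check in the log above points at the likely cause: once network.host is set to a non-loopback address, Elasticsearch enforces its production checks and requires explicit discovery settings. A minimal sketch of a single-node trial config (the exact values here are illustrative assumptions, not taken from the question):
# /etc/elasticsearch/elasticsearch.yml
network.host: 0.0.0.0        # the remote-access change that triggers the production checks
discovery.type: single-node  # satisfies the failed discovery bootstrap check on a one-node trial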

Silence sysstat-collect.service?

The sar tool collects load values every 10 minutes on my CentOS Linux release 8.5.2111
via the service sysstat-collect.service. It fills /var/log/messages with:
Dec 26 12:50:04 node systemd[1]: Starting system activity accounting tool...
Dec 26 12:50:04 node systemd[1]: sysstat-collect.service: Succeeded.
Dec 26 12:50:04 node systemd[1]: Started system activity accounting tool
Every 10 minutes. That's annoying, and I want to silence it. Is it possible?
Thanks in advance.
You can use logrotate to select which logs you want to keep or delete.
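For example, a minimal logrotate stanza along those lines (a sketch only; the file name /etc/logrotate.d/messages and the retention values are assumptions, not from the answer):
# /etc/logrotate.d/messages
/var/log/messages {
    weekly
    rotate 4      # keep four rotations, then delete the oldest
    compress
    missingok
    notifempty
}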

Systemd "OnFailure=" not starting when binary or bash exits with an error code

So I have a systemd unit that needs to be monitored, restarted in case of a crash, and have something done in case the unit fails. I'm working on an embedded system, so this needs to be robust.
In my case we have a systemd service:
[Unit]
Description=Demo unit
Wants=multi-user.target
OnFailure=FailHandler@%N.service

[Service]
ExecStart=/bin/bash /home/root/demo.sh
Restart=on-failure
RestartSec=1
Type=simple
The bash script I start:
echo "Started demo.sh"
current_date=`date`
sleep 10s
echo "${current_date} Demo was here" >> /home/root/demo.txt
exit 1
So far so good. The script always exits with 1 after 10 seconds and logs the time. The problem is that FailHandler is never called in that case. This is just a demo; all of the real applications are in C++, but the behavior is the same. However, if I deliberately set a wrong path to the bash file, the unit fails and it does start the "OnFailure=" part. Here's the syslog output from having the correct path:
2021-09-03T13:06:31.575094+00:00 hostname bash[1125]: Started demo.sh
2021-09-03T13:06:41.629450+00:00 hostname systemd[1]: demo.service: Main process exited, code=exited, status=1/FAILURE
2021-09-03T13:06:41.644681+00:00 hostname systemd[1]: demo.service: Failed with result 'exit-code'.
2021-09-03T13:06:41.818089+00:00 hostname systemd[1]: demo.service: Service RestartSec=100ms expired, scheduling restart.
2021-09-03T13:06:41.824005+00:00 hostname systemd[1]: demo.service: Scheduled restart job, restart counter is at 1.
2021-09-03T13:06:41.850933+00:00 hostname bash[1179]: Started demo.sh
2021-09-03T13:06:51.870376+00:00 hostname systemd[1]: demo.service: Main process exited, code=exited, status=1/FAILURE
2021-09-03T13:06:51.872611+00:00 hostname systemd[1]: demo.service: Failed with result 'exit-code'.
2021-09-03T13:06:52.117479+00:00 hostname systemd[1]: demo.service: Service RestartSec=100ms expired, scheduling restart.
2021-09-03T13:06:52.136102+00:00 hostname systemd[1]: demo.service: Scheduled restart job, restart counter is at 2.
2021-09-03T13:06:52.163865+00:00 hostname bash[1221]: Started demo.sh
Here's the output when the path is incorrect:
2021-09-03T13:07:46.582269+00:00 hostname bash[1446]: /bin/bash: /ahome/root/daemo.sh: No such file or directory
2021-09-03T13:07:46.588715+00:00 hostname systemd[1]: daemo.service: Main process exited, code=exited, status=127/n/a
2021-09-03T13:07:46.590356+00:00 hostname systemd[1]: daemo.service: Failed with result 'exit-code'.
2021-09-03T13:07:46.694616+00:00 hostname systemd[1]: daemo.service: Service RestartSec=100ms expired, scheduling restart.
2021-09-03T13:07:46.701519+00:00 hostname systemd[1]: daemo.service: Scheduled restart job, restart counter is at 1.
2021-09-03T13:07:46.720879+00:00 hostname systemd[1]: daemo.service: Start request repeated too quickly.
2021-09-03T13:07:46.721405+00:00 hostname systemd[1]: daemo.service: Failed with result 'exit-code'.
2021-09-03T13:07:46.722723+00:00 hostname systemd[1]: daemo.service: Triggering OnFailure= dependencies.
2021-09-03T13:07:46.804815+00:00 hostname FailHandler.sh[1457]: Failed application: daemo
2021-09-03T13:07:46.822342+00:00 hostname bash[1457]: error: cannot stat /etc/logrotate.d/daemo: No such file or directory
2021-09-03T13:07:46.841577+00:00 hostname FailHandler.sh[1457]: ERROR: Failed logrotate for daemo crash
2021-09-03T13:07:46.977003+00:00 hostname systemd[1]: FailHandler@daemo.service: Succeeded.
I understand from the syslog that it starts the FailHandler whenever the number of restarts reaches StartLimitBurst=1 within 100 ms, but is there a way to make it start any time the application exits with an error code?
Thank you, man. I took one look at the link you sent and it clicked. The solution in my case was:
ExecStopPost=/bin/bash -c 'if [ "$$EXIT_STATUS" != 0 ]; then systemctl start FailHandler@%N.service; fi'
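This works because systemd exports $EXIT_STATUS into the environment of ExecStopPost= commands (the doubled $$ passes a literal dollar sign through to the shell), and %N expands to the unit's own name, so the handler is started as FailHandler@demo.service on any non-zero exit. For completeness, a hypothetical sketch of what the FailHandler@.service template could look like (the OP's actual template is not shown in the question):
# /etc/systemd/system/FailHandler@.service -- hypothetical sketch
[Unit]
Description=Failure handler for %i

[Service]
Type=oneshot
# %i is the instance name, i.e. the name of the unit that failed
ExecStart=/home/root/FailHandler.sh %i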

couchdb.service: Failed with result 'start-limit-hit'

After I installed CouchDB, I could get the welcome information:
$ curl localhost:5984
{"couchdb":"Welcome","version":"2.1.2","features":["scheduler"],"vendor":{"name":"The Apache Software Foundation"}}
But when I check the status with systemctl, the unit has failed:
$ systemctl status couchdb.service
● couchdb.service
Loaded: not-found (Reason: No such file or directory)
Active: failed (Result: start-limit-hit) since Mon 2018-12-03 14:52:14 CST; 6min ago
Main PID: 30946 (code=killed, signal=USR2)
Dec 03 14:52:14 gpuhuawei systemd[1]: couchdb.service: Unit entered failed state.
Dec 03 14:52:14 gpuhuawei systemd[1]: couchdb.service: Failed with result 'signal'.
Dec 03 14:52:14 gpuhuawei systemd[1]: couchdb.service: Service hold-off time over, scheduling restart.
Dec 03 14:52:14 gpuhuawei systemd[1]: Stopped Apache CouchDB.
Dec 03 14:52:14 gpuhuawei systemd[1]: couchdb.service: Start request repeated too quickly.
Dec 03 14:52:14 gpuhuawei systemd[1]: Failed to start Apache CouchDB.
Dec 03 14:52:14 gpuhuawei systemd[1]: couchdb.service: Unit entered failed state.
Dec 03 14:52:14 gpuhuawei systemd[1]: couchdb.service: Failed with result 'start-limit-hit'.
Dec 03 14:53:53 gpuhuawei systemd[1]: Stopped Apache CouchDB.
Dec 03 14:53:53 gpuhuawei systemd[1]: Stopped Apache CouchDB.
When I run couchdb from the command line, I get:
$ couchdb
{"init terminating in do_boot",{{badmatch,{error,{bad_return,{{couch_app,start,[normal,["/etc/couchdb/default.ini","/etc/couchdb/local.ini"]]},{'EXIT',{{badmatch,{error,{error,eacces}}},[{couch_server_sup,start_server,1,[{file,"couch_server_sup.erl"},{line,56}]},{application_master,start_it_old,4,[{file,"application_master.erl"},{line,273}]}]}}}}}},[{couch,start,0,[{file,"couch.erl"},{line,18}]},{init,start_it,1,[]},{init,start_em,1,[]}]}}
[1] 2288 user-defined signal 2 couchdb
My work environment:
$ uname -a
Linux gpuhuawei 4.15.0-34-generic #37~16.04.1-Ubuntu SMP Tue Aug 28 10:44:06 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
This is a bit late, but the "start-limit-hit" message is a red herring. I have seen something very similar with a Moodle installation using MySQL, and it's actually saying that you (or the service start process) have tried to restart the database too many times, or too soon after a failed attempt to start. Basically, the start-limit-hit message is saying "stop trying to do the same thing and expecting different results".
The actual issue will be further up in the syslog. Unhelpfully, service status does not return enough lines of the error messages to actually show what is wrong. Try a service start, then go and look in the actual syslog, and you will see the series of start attempts; a line just above each one will hopefully tell you the actual issue. In my case, the problem was that the mount point containing the database was missing (thanks, Azure). For one attempt of service start, it tried to start 5 times in quick succession, failing each time because the data directory was not mounted, and on the sixth it failed with start-limit-hit.
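To get more context than systemctl status shows, something like this helps (the unit name matches the question; adjust -n for how many lines you want back):
# show the last 100 journal lines for the unit, without the pager
journalctl -u couchdb.service -n 100 --no-pager
# or follow the log live while retrying the start
journalctl -u couchdb.service -f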
Always back up your data/ and etc/ directories prior to upgrading CouchDB.
We recommend that you overwrite your etc/default.ini file with the version provided by the new release. New defaults sometimes contain mandatory changes to enable default functionality. Always place your customizations in etc/local.ini or any etc/local.d/*.ini file.
(I followed this and it worked.)
https://docs.couchdb.org/en/3.0.0/install/upgrading.html

Error starting Apache 2.4.6 on openSUSE 13.1

When I try to start Apache, this happens:
Job for apache2.service failed. See 'systemctl status apache2.service' and 'journalctl -xn' for details.
systemctl status apache2.service -l returns this:
Mar 24 23:41:57 glauber-pc.site start_apache2[3249]: httpd2-prefork: Syntax error on line 179 of /etc/apache2/httpd.conf: Syntax error on line 102 of /etc/apache2/default-server.conf: Syntax error on line 1 of /etc/apache2/conf.d/mod_evasive.conf: Cannot load /usr/lib64/apache2/mod_evasive20.so into server: /usr/lib64/apache2/mod_evasive20.so: cannot open shared object file: No such file or directory
Mar 24 23:41:57 glauber-pc.site systemd[1]: apache2.service: main process exited, code=exited, status=1/FAILURE
Mar 24 23:41:57 glauber-pc.site systemd[1]: Failed to start The Apache Webserver.
Mar 24 23:41:57 glauber-pc.site systemd[1]: Unit apache2.service entered failed state.
Everything seems right in those lines; any clue what it could be?
Change the first line in
/etc/apache2/conf.d/mod_evasive.conf
to
LoadModule evasive20_module /usr/lib64/apache2/mod_evasive24.so
(...24.so instead of ...20.so)
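If you want to verify which version of the module file is actually installed before editing (a quick check, using the same paths as in the error above):
# see which mod_evasive shared object exists
ls /usr/lib64/apache2/ | grep -i evasive
# then restart Apache to pick up the corrected config
sudo systemctl restart apache2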
