Taskserver fails to start on NixOS

I have a NixOS 17.03 server with the taskserver package. The taskserver service does not start anymore (it used to start, but I can't pinpoint the moment when it stopped working).
Here is the portion of my configuration.nix related to taskserver:
services.taskserver.enable = true;
services.taskserver.fqdn = config.networking.hostName;
services.taskserver.listenHost = config.networking.hostName;
services.taskserver.organisations.myorga.users = [ "henri" ];
And the error details:
systemctl status taskserver
● taskserver.service - Taskwarrior Server
Loaded: loaded (/nix/store/dy9rz3al85s6rxifrwqmm6sf3nsnb6wz-unit-taskserver.service/taskserver.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Sat 2017-05-27 19:52:27 CEST; 15min ago
Process: 5241 ExecStart=taskd server --ca.cert=/var/lib/taskserver/keys/ca.cert --ciphers= --client.allow= --client.deny= --confirmation=true --daemon=false --debug=false --extensions= --ip.log=false --log=- --queue.size=10 --request.limit=1048576 --server.cert=/var/lib/taskserver/keys/server.cert --server.crl=/var/lib/taskserver/keys/server.crl --server.key=/var/lib/taskserver/keys/server.key --server=myserver:53589 --trust=strict (code=exited, status=255)
Process: 5239 ExecStartPre=/nix/store/29h8k2nld3cwmvqiqml125jxm7ndl62j-unit-script/bin/taskserver-pre-start (code=exited, status=0/SUCCESS)
Main PID: 5241 (code=exited, status=255)
May 27 19:52:27 myserver systemd[1]: taskserver.service: Main process exited, code=exited, status=255/n/a
May 27 19:52:27 myserver systemd[1]: taskserver.service: Unit entered failed state.
May 27 19:52:27 myserver systemd[1]: taskserver.service: Failed with result 'exit-code'.
May 27 19:52:27 myserver systemd[1]: taskserver.service: Service hold-off time over, scheduling restart.
May 27 19:52:27 myserver systemd[1]: Stopped Taskwarrior Server.
May 27 19:52:27 myserver systemd[1]: taskserver.service: Start request repeated too quickly.
May 27 19:52:27 myserver systemd[1]: Failed to start Taskwarrior Server.
May 27 19:52:27 myserver systemd[1]: taskserver.service: Unit entered failed state.
May 27 19:52:27 myserver systemd[1]: taskserver.service: Failed with result 'exit-code'.
journalctl:
May 27 19:52:27 myserver systemd[1]: Starting Initialize CA for TaskServer...
May 27 19:52:27 myserver systemd[1]: Started Initialize CA for TaskServer.
May 27 19:52:27 myserver systemd[1]: Starting Taskwarrior Server...
May 27 19:52:27 myserver systemd[1]: Started Taskwarrior Server.
May 27 19:52:27 myserver taskd[5241]: ERROR: Could not read include file '/nix/store/8g6zs5xf1yvbkv8nzjgjqc3zgwjfy8a8-taskdrc'.
May 27 19:52:27 myserver systemd[1]: taskserver.service: Main process exited, code=exited, status=255/n/a
May 27 19:52:27 myserver systemd[1]: taskserver.service: Unit entered failed state.
May 27 19:52:27 myserver systemd[1]: taskserver.service: Failed with result 'exit-code'.
May 27 19:52:27 myserver systemd[1]: taskserver.service: Service hold-off time over, scheduling restart.
May 27 19:52:27 myserver systemd[1]: Stopped Taskwarrior Server.
May 27 19:52:27 myserver systemd[1]: taskserver-ca.service: Start request repeated too quickly.
May 27 19:52:27 myserver systemd[1]: Failed to start Initialize CA for TaskServer.
May 27 19:52:27 myserver systemd[1]: taskserver-ca.service: Unit entered failed state.
May 27 19:52:27 myserver systemd[1]: taskserver-ca.service: Failed with result 'start-limit-hit'.
May 27 19:52:27 myserver systemd[1]: taskserver.service: Start request repeated too quickly.
May 27 19:52:27 myserver systemd[1]: Failed to start Taskwarrior Server.
May 27 19:52:27 myserver systemd[1]: taskserver.service: Unit entered failed state.
May 27 19:52:27 myserver systemd[1]: taskserver.service: Failed with result 'exit-code'.
Indeed, /nix/store/8g6zs5xf1yvbkv8nzjgjqc3zgwjfy8a8-taskdrc does not exist. I tried cleaning the store, updating and rebuilding the packages, and even upgrading to nixos-unstable, to no avail.
Technical details
System: NixOS 17.03.1203.58e227052d (Gorilla)
Nix version: 1.11.8
Nixpkgs version: 17.03.1203.58e227052d
Sandboxing enabled: false

The latest 17.03 branch has no references to taskdrc, because that file was removed.
Reading that commit, we can see that the reference to the taskdrc file was written into the ${cfg.dataDir}/config file. So, most likely, you should remove that include /nix/store/...-taskdrc line from your config for the service to start.
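For a concrete sketch of that cleanup (assuming the module's default dataDir of /var/lib/taskserver; the store hash on your system will differ):
# show the stale include line in the generated config
grep 'include /nix/store' /var/lib/taskserver/config
# back up the config and drop any line referencing the removed taskdrc file
sed -i.bak '/-taskdrc/d' /var/lib/taskserver/config
systemctl restart taskserver.service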
The reason you experienced this breakage is that Hydra (the NixOS CI) currently doesn't account for the fact that NixOS installations are sometimes upgraded in place, and such migrations should be tested too.

Related

Systemd does not activate service?

I need a service to keep running under systemd, but it doesn't activate. Why does this happen, given that I followed the documentation's recommendations? The relevant files are below.
The service unit:
# Contents of /etc/systemd/system/quark.service
[Unit]
Description=Quark
After=network.target
[Service]
Type=simple
User=cto
ExecStart=/usr/local/bin/python3.9 /var/net/
Restart=always
[Install]
WantedBy=multi-user.target
The status output:
● quark.service - Quark
Loaded: loaded (/etc/systemd/system/quark.service; enabled; vendor preset: en
Active: failed (Result: exit-code) since Mon 2021-06-21 15:20:34 UTC; 8s ago
Process: 1467 ExecStart=/usr/local/bin/python3.9 /var/net/ (code=exited, statu
Main PID: 1467 (code=exited, status=1/FAILURE)
Jun 21 15:20:34 webstrucs systemd[1]: quark.service: Main process exited, code=e
Jun 21 15:20:34 webstrucs systemd[1]: quark.service: Failed with result 'exit-co
Jun 21 15:20:34 webstrucs systemd[1]: quark.service: Service RestartSec=100ms ex
Jun 21 15:20:34 webstrucs systemd[1]: quark.service: Scheduled restart job, rest
Jun 21 15:20:34 webstrucs systemd[1]: Stopped Quark.
Jun 21 15:20:34 webstrucs systemd[1]: quark.service: Start request repeated too
Jun 21 15:20:34 webstrucs systemd[1]: quark.service: Failed with result 'exit-co
Jun 21 15:20:34 webstrucs systemd[1]: Failed to start Quark.
The ExecStart= directive should be the command to be executed. From the systemd manpages:
ExecStart=
Commands with their arguments that are executed when this service is started.
This stanza:
ExecStart=/usr/local/bin/python3.9 /var/net/
Should be:
ExecStart=/usr/local/bin/python3.9 path_to_python_script.py
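After correcting ExecStart=, reload systemd and restart the service so the change takes effect (standard systemd commands, using the unit name from the question):
systemctl daemon-reload
systemctl restart quark.service
systemctl status quark.service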

Why was the Puppet server running fine but then failing after some time?

When I logged in to my VM (Ubuntu 18.04) for the second time, it showed this error:
# systemctl status puppetserver.service
● puppetserver.service - puppetserver Service
Loaded: loaded (/lib/systemd/system/puppetserver.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Wed 2019-10-02 11:42:52 UTC; 2min 31s ago
Process: 23034 ExecStart=/opt/puppetlabs/server/apps/puppetserver/bin/puppetserver start (code=exited, status=1/FAILURE)
Oct 02 11:42:52 puppet-master systemd[1]: puppetserver.service: Control process exited, code=exited status=1
Oct 02 11:42:52 puppet-master systemd[1]: puppetserver.service: Failed with result 'exit-code'.
Oct 02 11:42:52 puppet-master systemd[1]: Failed to start puppetserver Service.
Oct 02 11:42:52 puppet-master systemd[1]: puppetserver.service: Service hold-off time over, scheduling restart.
Oct 02 11:42:52 puppet-master systemd[1]: puppetserver.service: Scheduled restart job, restart counter is at 5.
Oct 02 11:42:52 puppet-master systemd[1]: Stopped puppetserver Service.
Oct 02 11:42:52 puppet-master systemd[1]: puppetserver.service: Start request repeated too quickly.
Oct 02 11:42:52 puppet-master systemd[1]: puppetserver.service: Failed with result 'exit-code'.
Oct 02 11:42:52 puppet-master systemd[1]: Failed to start puppetserver Service.
Is there a way to identify the issue?
To troubleshoot more, you can run:
journalctl -xe -u puppetserver
And also check the /var/log/puppetlabs/puppetserver/puppetserver.log file for more info, but based on the error message:
Start request repeated too quickly
It seems you didn't wait long enough for the restart (it can take a few minutes).
To restart, I recommend:
systemctl restart puppet*
systemctl restart puppet pxp-agent
This way, systemd manages the dependencies between all the puppet* services.
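If the unit is already stuck in the "start request repeated too quickly" state, you may first need to clear its failure counter (a standard systemd command, not part of the original answer):
systemctl reset-failed puppetserver.service
systemctl restart puppetserver.service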

unit falling into a failed state (status=143) when stopping service [closed]

Here is my problem. I have CentOS with a Java process running on it. The Java process is managed by a start/stop script, which also creates a .pid file for the Java instance.
My unit file looks like this:
[Unit]
After=syslog.target network.target
Description=Someservice
[Service]
User=xxxuser
Type=forking
WorkingDirectory=/srv/apps/someservice
ExecStart=/srv/apps/someservice/server.sh start
ExecStop=/srv/apps/someservice/server.sh stop
PIDFile=/srv/apps/someservice/application.pid
TimeoutStartSec=0
[Install]
WantedBy=multi-user.target
When I call the stop function, the script terminates the Java process with SIGTERM and returns code 0:
kill $OPT_FORCEKILL `cat $PID_FILE`
<...>
return 0
After that, if I check the status of my unit, I get something like this (status=143):
● someservice.service - Someservice
Loaded: loaded (/usr/lib/systemd/system/someservice.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Wed 2017-08-30 09:17:40 EEST; 4s ago
Process: 48365 ExecStop=/srv/apps/someservice/server.sh stop (code=exited, status=0/SUCCESS)
Main PID: 46115 (code=exited, status=143)
Aug 29 17:10:02 whatever.domain.com systemd[1]: Starting Someservice...
Aug 29 17:10:02 whatever.domain.com systemd[1]: PID file /srv/apps/someservice/application.pid not readable (yet?) after start.
Aug 29 17:10:04 whatever.domain.com systemd[1]: Started Someservice.
Aug 30 09:17:39 whatever.domain.com systemd[1]: Stopping Someservice...
Aug 30 09:17:39 whatever.domain.com server.sh[48365]: Stopping someservice - PID [46115]
Aug 30 09:17:40 whatever.domain.com systemd[1]: someservice.service: main process exited, code=exited, status=143/n/a
Aug 30 09:17:40 whatever.domain.com systemd[1]: Stopped Someservice.
Aug 30 09:17:40 whatever.domain.com systemd[1]: Unit someservice.service entered failed state.
Aug 30 09:17:40 whatever.domain.com systemd[1]: someservice.service failed.
When the start/stop script has no return value at all, it behaves exactly the same.
Adding something like this to the unit file:
[Service]
SuccessExitStatus=143
is not a good solution for me. Why does systemd act this way instead of showing a normal service state?
When I modify my start/stop script to return 10 instead of return 0, it behaves the same, but I can see that the exit code 10 is passed through.
Here is an example:
● someservice.service - Someservice
Loaded: loaded (/usr/lib/systemd/system/someservice.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Wed 2017-08-30 09:36:22 EEST; 5s ago
Process: 48460 ExecStop=/srv/apps/someservice/server.sh stop (code=exited, status=10)
Process: 48424 ExecStart=/srv/apps/someservice/server.sh start (code=exited, status=0/SUCCESS)
Main PID: 48430 (code=exited, status=143)
Aug 30 09:36:11 whatever.domain.com systemd[1]: Starting Someservice...
Aug 30 09:36:11 whatever.domain.com systemd[1]: PID file /srv/apps/someservice/application.pid not readable (yet?) after start.
Aug 30 09:36:13 whatever.domain.com systemd[1]: Started Someservice.
Aug 30 09:36:17 whatever.domain.com systemd[1]: Stopping Someservice...
Aug 30 09:36:17 whatever.domain.com server.sh[48460]: Stopping someservice - PID [48430]
Aug 30 09:36:21 whatever.domain.com systemd[1]: someservice.service: main process exited, code=exited, status=143/n/a
Aug 30 09:36:22 whatever.domain.com systemd[1]: someservice.service: control process exited, code=exited status=10
Aug 30 09:36:22 whatever.domain.com systemd[1]: Stopped Someservice.
Aug 30 09:36:22 whatever.domain.com systemd[1]: Unit someservice.service entered failed state.
Aug 30 09:36:22 whatever.domain.com systemd[1]: someservice.service failed.
From the journalctl log I can see that systemd first reports status=143 and only then my return value of 10. So I guess my mistake is somewhere in the start/stop script (because exit code 143 appears before the function returns 0)?
Exit status 143 is 128 + 15, i.e. the Java process was terminated by SIGTERM, which systemd counts as a failure by default. You should be able to suppress this by adding that exit code into the unit file as a "success" exit status:
[Service]
SuccessExitStatus=143
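For example, this can be applied as a drop-in override so the original unit file stays untouched (standard systemctl usage; the unit name is taken from the question):
systemctl edit someservice.service
# in the editor that opens, add:
# [Service]
# SuccessExitStatus=143
systemctl daemon-reload
systemctl restart someservice.service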

calico-node rkt returns stage1-fly.aci.asc: no such file or directory

I have a CoreOS beta (1185.2.0) installed.
I have the following systemd service file to start calico-node:
[Unit]
Description=Calico per-host agent
Requires=network-online.target
After=network-online.target
[Service]
Slice=machine.slice
PermissionsStartOnly=true
Environment=ETCD_CA_CERT_FILE=/etc/ssl/etcd/ca.pem
Environment=ETCD_CERT_FILE=/etc/ssl/etcd/etcd1.pem
Environment=ETCD_KEY_FILE=/etc/ssl/etcd/etcd1-key.pem
Environment=CALICO_DISABLE_FILE_LOGGING=true
Environment=HOSTNAME=10.79.218.2
Environment=IP=10.79.218.2
Environment=FELIX_FELIXHOSTNAME=10.79.218.2
Environment=CALICO_NETWORKING=true
Environment=NO_DEFAULT_POOLS=true
Environment=ETCD_ENDPOINTS=https://coreos-2.tux-in.com:2379,https://coreos-3.tux-in.com:2379
ExecStartPre=/bin/mkdir /var/run/calico
ExecStart=/usr/bin/rkt run --inherit-env --stage1-from-dir=stage1-fly.aci --volume=var-run-calico,kind=host,source=/var/run/calico --volume=modules,kind=host,source=/lib/modules,readOnly=false --mount=volume=modules,target=/lib/modules --volume=dns,kind=host,source=/etc/resolv.conf,readOnly=true --volume=etcd-tls-certs,kind=host,source=/etc/ssl/etcd,readOnly=true --mount=volume=dns,target=/etc/resolv.conf --mount=volume=etcd-tls-certs,target=/etc/ssl/etcd --mount=volume=var-run-calico,target=/var/run/calico --trust-keys-from-https quay.io/calico/node:v0.22.0
KillMode=mixed
Restart=always
TimeoutStartSec=0
[Install]
WantedBy=multi-user.target
Welp... the systemd service fails with:
● calico-node.service - Calico per-host agent
Loaded: loaded (/etc/systemd/system/calico-node.service; enabled; vendor preset: disabled)
Active: failed (Result: start-limit-hit) since Tue 2016-10-25 04:51:15 UTC; 9min ago
Process: 1970 ExecStart=/usr/bin/rkt run --inherit-env --stage1-from-dir=stage1-fly.aci --volume=var-run-calico,kind=host,source=/var/
Process: 4307 ExecStartPre=/bin/mkdir /var/run/calico (code=exited, status=1/FAILURE)
Main PID: 1970 (code=exited, status=1/FAILURE)
Oct 25 04:51:15 coreos-2.tux-in.com systemd[1]: Failed to start Calico per-host agent.
Oct 25 04:51:15 coreos-2.tux-in.com systemd[1]: calico-node.service: Unit entered failed state.
Oct 25 04:51:15 coreos-2.tux-in.com systemd[1]: calico-node.service: Failed with result 'exit-code'.
Oct 25 04:51:15 coreos-2.tux-in.com systemd[1]: calico-node.service: Service hold-off time over, scheduling restart.
Oct 25 04:51:15 coreos-2.tux-in.com systemd[1]: Stopped Calico per-host agent.
Oct 25 04:51:15 coreos-2.tux-in.com systemd[1]: calico-node.service: Start request repeated too quickly.
Oct 25 04:51:15 coreos-2.tux-in.com systemd[1]: Failed to start Calico per-host agent.
Oct 25 04:51:15 coreos-2.tux-in.com systemd[1]: calico-node.service: Unit entered failed state.
Oct 25 04:51:15 coreos-2.tux-in.com systemd[1]: calico-node.service: Failed with result 'start-limit-hit'.
I tried setting the environment variables in a terminal and running the rkt command manually, and I got this error message:
image: using image from file /usr/lib/rkt/stage1-images/stage1-fly.aci
run: open /usr/lib/rkt/stage1-images/stage1-fly.aci.asc: no such file or directory
I think the error may be related to the following configuration file at /etc/rkt/paths.d/paths.json:
{
  "rktKind": "paths",
  "rktVersion": "v1",
  "stage1-images": "/usr/lib/rkt/stage1-images"
}
I need this paths configuration file later on for Kubernetes.
Any ideas? The .asc file really doesn't exist there.
/usr/lib is a symbolic link to /usr/lib64, and with that paths.json in place rkt ends up not searching for the container image certificates in the right place (/usr/lib64 versus /usr/lib).
It seems this path is already configured properly by default, so simply removing the file /etc/rkt/paths.d/paths.json resolves the issue.
The full answer is at https://github.com/coreos/rkt/issues/3320
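As a quick sketch of the fix (the file path and service name are taken from the question):
rm /etc/rkt/paths.d/paths.json
systemctl restart calico-node.service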

Cannot start keystone service

I installed packstack on my fresh installation of Fedora 21 with all updates. When I ran packstack --allinone, I received this error:
ERROR : Error appeared during Puppet run: 192.168.1.*_keystone.pp
Error: Could not start Service[keystone]: Execution of '/sbin/service openstack-keystone start' returned 1: Redirecting to /bin/systemctl start openstack-keystone.service
You will find full trace in log /var/tmp/packstack/20141223-022613-whLvTs/manifests/192.168.1.*_keystone.pp.log
And this is the log:
Notice: /Stage[main]/Cinder::Keystone::Auth/Keystone_user_role[cinder#services]: Dependency Service[keystone] has failures: true
Warning: /Stage[main]/Cinder::Keystone::Auth/Keystone_user_role[cinder#services]: Skipping because of failed dependencies
Notice: Finished catalog run in 13.02 seconds
With systemctl status openstack-keystone.service I get this:
openstack-keystone.service - OpenStack Identity Service (code-named Keystone)
Loaded: loaded (/usr/lib/systemd/system/openstack-keystone.service; disabled)
Active: failed (Result: start-limit) since Tue 2014-12-23 19:47:36 EET; 1min 59s ago
Process: 22526 ExecStart=/usr/bin/keystone-all (code=exited, status=1/FAILURE)
Main PID: 22526 (code=exited, status=1/FAILURE)
Dec 23 19:47:35 localhost.localdomain systemd[1]: Failed to start OpenStack...
Dec 23 19:47:35 localhost.localdomain systemd[1]: Unit openstack-keystone.s...
Dec 23 19:47:35 localhost.localdomain systemd[1]: openstack-keystone.servic...
Dec 23 19:47:36 localhost.localdomain systemd[1]: start request repeated to...
Dec 23 19:47:36 localhost.localdomain systemd[1]: Failed to start OpenStack...
Dec 23 19:47:36 localhost.localdomain systemd[1]: Unit openstack-keystone.s...
Dec 23 19:47:36 localhost.localdomain systemd[1]: openstack-keystone.servic...
This can happen due to an SELinux AVC denial caused by a missing policy.
You can try putting SELinux into permissive mode:
# setenforce 0
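Note that setenforce 0 only lasts until reboot. To confirm that an AVC denial is indeed the cause, you can inspect the audit log first (standard SELinux audit tooling, not part of the original answer):
# list recent AVC denials
ausearch -m avc -ts recent
# or grep the raw audit log
grep AVC /var/log/audit/audit.log | tail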
