Cannot install freeradius server installation exits with code (1) - linux

Trying to install freeradius package on Debian 10 buster and it fails.
$ sudo apt install freeradius
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Suggested packages:
freeradius-krb5 freeradius-ldap freeradius-mysql freeradius-postgresql freeradius-python3
The following NEW packages will be installed:
freeradius
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 555 kB of archives.
After this operation, 2230 kB of additional disk space will be used.
Get:1 http://ftp.de.debian.org/debian sid/main amd64 freeradius amd64 3.0.21+dfsg-2+b2 [555 kB]
Fetched 555 kB in 0s (2753 kB/s)
Selecting previously unselected package freeradius.
(Reading database ... 140557 files and directories currently installed.)
Preparing to unpack .../freeradius_3.0.21+dfsg-2+b2_amd64.deb ...
Unpacking freeradius (3.0.21+dfsg-2+b2) ...
Setting up freeradius (3.0.21+dfsg-2+b2) ...
update-rc.d: warning: start and stop actions are no longer supported; falling back to defaults
install: invalid user ‘freerad’
dpkg: error processing package freeradius (--configure):
installed freeradius package post-installation script subprocess returned error exit status 1
Errors were encountered while processing:
freeradius
E: Sub-process /usr/bin/dpkg returned an error code (1)
install: invalid user ‘freerad’
Apparently it seems as if there's something wrong with the freerad user which doesn't exist?
Typing / checking inside /etc/passwd file learly shows there's no such user
$ cat /etc/passwd | grep freer*
Checking the syslog shows:
Feb 10 00:38:45 server-1 systemd[1]: freeradius.service: Scheduled restart job, restart counter is at 277.
Feb 10 00:38:45 server-1 freeradius[10918]: FreeRADIUS Version 3.0.21
Feb 10 00:38:45 server-1 freeradius[10918]: Copyright (C) 1999-2019 The FreeRADIUS server project and contributors
Feb 10 00:38:45 server-1 freeradius[10918]: There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
Feb 10 00:38:45 server-1 freeradius[10918]: PARTICULAR PURPOSE
Feb 10 00:38:45 server-1 freeradius[10918]: You may redistribute copies of FreeRADIUS under the terms of the
Feb 10 00:38:45 server-1 freeradius[10918]: GNU General Public License
Feb 10 00:38:45 server-1 freeradius[10918]: For more information about these matters, see the file named COPYRIGHT
Feb 10 00:38:45 server-1 freeradius[10918]: Errors reading /etc/freeradius/3.0: Permission denied
Feb 10 00:38:45 server-1 systemd[1]: freeradius.service: Control process exited, code=exited, status=1/FAILURE
Feb 10 00:38:45 server-1 systemd[1]: freeradius.service: Failed with result 'exit-code'.
Tried a**smarting this by adding the user manually with the command adduser freeradius
and then tried re-installing:
$ sudo apt install freeradius
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
freeradius is already the newest version (3.0.21+dfsg-2+b2).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
1 not fully installed or removed.
After this operation, 0 B of additional disk space will be used.
Do you want to continue? [Y/n] y
Setting up freeradius (3.0.21+dfsg-2+b2) ...
update-rc.d: warning: start and stop actions are no longer supported; falling back to defaults
chown: cannot access '/etc/freeradius': No such file or directory
dpkg: error processing package freeradius (--configure):
installed freeradius package post-installation script subprocess returned error exit status 1
Errors were encountered while processing:
freeradius
E: Sub-process /usr/bin/dpkg returned an error code (1)
now it fails prompting a new error: chown: cannot access '/etc/freeradius': No such file or directory
Tried purging / removing / clean / cache deleting / rebooting and then restarting the installation but it keeps reappearing.
Let me know if there's more info I can provide in manner to help you help me.
Cheers.

when removing freeradius (either by apt remove freeradius or apt purge freeradius) mind that config files are a seperate package called freeradius-config
so to completely wipe freeradius and do a reinstall do:
apt purge freeradius freeradius-config
rm -rf /etc/freeradius/
apt install freeradius

After installation of freeradius, add the following lines at the end of the client.conf file to set the access of your client:
#vi /etc/freeradius/3.0/clients.conf
For Example :
client SWITCH-01 {
ipaddr = 192.168.0.10
secret = kamisama123
}
client LINUX-01 {
ipaddr = 192.168.0.20
secret = vegeto123
}
In our example, we are adding 2 client devices, both devices will offer a login promt to authenticate on freeradius server database.
Now, locate and edit the freeradius users configuration file:
Example:
# vi /etc/freeradius/3.0/users
Adding Your Username And Password Like below:
freerad1 Cleartext-Password := "freerad1234"
freerad2 Cleartext-Password := "freerad2123"
Now, Restart freeradius server :
# service freeradius restart
Now, test your radius server configuration file :
#freeradius -CX
you can Test your radius authentication locally on the Radius server using the following commands:
# radtest freerad1 freerad1234 localhost 0 testing123
Here is an example of a successful radius authentication:
Sent Access-Request Id 151 from 0.0.0.0:34857 to 127.0.0.1:1812 length 75
User-Name = "freerad1"
User-Password = "freerad1234"
NAS-IP-Address = 172.31.41.98
NAS-Port = 0
Message-Authenticator = 0x00
Cleartext-Password = "freerad1234"
Received Access-Accept Id 151 from 127.0.0.1:1812 to 0.0.0.0:0 length 20
We are using the freerad1 username and the freerad1234 password to authenticate the user account.
The testing123 is a default device password included in the clients.conf file.
Now, go to the Linux server included on the clients.conf configuration file as LINUX-01.
Install the freeradius-utils package.
# apt-get install freeradius-utils
Test your radius authentication remotely on the Linux server (LINUX-01 OR YOUR CLIENT YOU ARE ADDED ON client.conf) using the following commands:
# radtest freerad freerad1234 192.168.0.50 0 vegeto123
** Congratulations, Your Freeradius server is on the service **

Related

How is ssh2 installed for PHP 7.4 on Rocky Linux?

I recently upgraded to Rocky. I am getting an error on a PHP SSH command that has worked for years. Everything I try fails for various reasons. I need to ssh from PHP. At this point I am not particular about the method used. This is the code used for ssh connections:
$ssh = ssh2_connect($ssh_domain);
if( ssh2_auth_password($ssh_user, $ssh_password) )
{
// send the command
}
but I get this error on the connect line:
Call to undefined function ssh2_connect()
I installed with
yum install libssh2-devel
Last metadata expiration check: 0:02:00 ago on Wed 03 Aug 2022 12:50:28 PM EDT.
Dependencies resolved.
====================================================================================================================================================
Package Architecture Version Repository Size
====================================================================================================================================================
Installing:
libssh2-devel x86_64 1.9.0-5.el8 epel 68 k
Transaction Summary
====================================================================================================================================================
Install 1 Package
Total download size: 68 k
Installed size: 288 k
Is this ok [y/N]: y
Downloading Packages:
libssh2-devel-1.9.0-5.el8.x86_64.rpm 309 kB/s | 68 kB 00:00
----------------------------------------------------------------------------------------------------------------------------------------------------
Total 141 kB/s | 68 kB 00:00
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
Preparing : 1/1
Installing : libssh2-devel-1.9.0-5.el8.x86_64 1/1
Running scriptlet: libssh2-devel-1.9.0-5.el8.x86_64 1/1
Verifying : libssh2-devel-1.9.0-5.el8.x86_64 1/1
Installed:
libssh2-devel-1.9.0-5.el8.x86_64
Complete!
and then I restarted apache.
What am I doing wrong? Did I install the wrong package? Maybe I need to install something else?
Thanks for the help.

/var/lib/tor cannot be read: Permission denied or Couldn't create private data directory

I use google cloud shell to execute this program
Linux version
Distributor ID: Debian
Description: Debian GNU/Linux 10 (buster)
Release: 10
Codename: buster
Tor version 0.3.5.10.
When I tried restarting "sudo service tor restart" Tor I received an error
[ ok ] Stopping tor daemon...done (not running - there is no /run/tor/tor.pid).
[....] Starting tor daemon...Jun 27 01:51:04.132 [warn] Directory /var/lib/tor cannot be read: Permission denied
Jun 27 01:51:04.132 [warn] Failed to parse/validate config: Couldn't create private data directory "/var/lib/tor"
Jun 27 01:51:04.132 [err] Reading config failed--see warnings above.
failed.
So I set full permissions for the tor directory sudo chmod -R 777 /var/lib/tor
[FAIL] Checking if tor configuration is valid ... failed!
Jun 27 01:53:59.685 [notice] Tor 0.3.5.10 running on Linux with Libevent 2.1.8-stable, OpenSSL 1.1.1g, Zlib 1.2.11, Liblzma 5.2.4, and Libzstd 1.3.8.
Jun 27 01:53:59.685 [notice] Tor can't help you if you use it wrong! Learn how to be safe at https://www.torproject.org/download/download#warning
Jun 27 01:53:59.685 [notice] Read configuration file "/usr/share/tor/tor-service-defaults-torrc".
Jun 27 01:53:59.685 [notice] Read configuration file "/etc/tor/torrc".
Jun 27 01:53:59.688 [warn] Error setting groups to gid 114: "Operation not permitted".
Jun 27 01:53:59.688 [warn] If you set the "User" option, you must start Tor as root.
Jun 27 01:53:59.688 [warn] Failed to parse/validate config: Problem with User value. See logs for details.
Jun 27 01:53:59.688 [err] Reading config failed--see warnings above.
I use root privileges sudo su
[ ok ] Stopping tor daemon...done (not running - there is no /run/tor/tor.pid).
[....] Starting tor daemon...Jun 27 01:58:58.455 [warn] Directory /var/lib/tor cannot be read: Permission denied
Jun 27 01:58:58.455 [warn] Failed to parse/validate config: Couldn't create private data directory "/var/lib/tor"
Jun 27 01:58:58.455 [err] Reading config failed--see warnings above.
Is there any way that can help me solve my problem or how can i be able to install tor version 2.9.14?
You might have already solved the problem by now, if not I hope this can help.
Is there any way that can help me solve my problem?
OPTION 1
Let's take a look at these warnings:
[warn] Error setting groups to gid 114: "Operation not permitted".
[warn] If you set the "User" option, you must start Tor as root.
[warn] Failed to parse/validate config: Problem with User value.
To get a log of all users run cat /etc/passwd and you'll see debian-tor listed:
...
debian-tor:x:108:114::/var/lib/tor:/bin/false
...
The folder /var/lib/tor is owned by user debian-tor, so sudo -u debian-tor tor will work.
Alternatively, you can run this for your current user: (or chmod 777 for all)
chmod 700 -R /var/lib/tor/*
chown -R tor /var/lib/tor/
sudo service tor restart
You actually should run tor as non-root, else you get this message:
You are running Tor as root. You don't need to, and you probably shouldn't.
OPTION 2
As the warning suggests to see logs for details you should check for a message within dsmeg and /var/log/syslog. If you find anything then it can be AppArmor or SELinux blocking tor. Both SELinux and AppArmor provide a set of tools to isolate applications from each other to protect the host system from being compromised, so it's not recommended disabling them permanently but temporarily for debugging.
According to Debian SELinux support:
The Debian packaged Linux kernels have SELinux support compiled in,
but disabled by default.
Check the SELinux state with getenforce, if the output is Permissive or Disabled then you're set.
Moreover, looking at AppArmor/Progress:
Since Debian 10 (Buster), AppArmor is enabled by default.
To disable AppArmor on your system run: (reference)
sudo mkdir -p /etc/default/grub.d
echo 'GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT apparmor=0"' \
| sudo tee /etc/default/grub.d/apparmor.cfg
sudo update-grub
sudo reboot
There's a chance that either one's the culprit. Users have reported similar issue here.
How can i be able to install tor version 2.9.14?
Downgrading the tor package is as simple as this:
sudo apt-get install tor=0.2.9.14
But why would you want do that?
tor v2 will be deprecated soon. You'll see warnings like:
[warn] At least one protocol listed as required in the consensus is
not supported by this version of Tor. You should upgrade. This version
of Tor will not work as a client on the Tor network. The missing
protocols are: DirCache=2 HSDir=2 HSIntro=4 Link=4-5
NB: Post on tor.stackexchange for tor related issues.

Docker - Cannot build multi-platform images with docker buildx

I'm trying to build a multi-platform (amd64, arm64 and armv7) image using docker buildx. Since I'm using an amd64 machine running Ubuntu 18.04, I followed the instructions on the Docker website and installed qemu via:
sudo apt install qemu-user
However, a weird error appears when I execute the previous command. More specifically, there seems to be an issue with the binfmt-support service. Here's the full log:
Reading package lists... Done
Building dependency tree
Reading state information... Done
Starting pkgProblemResolver with broken count: 0
Starting 2 pkgProblemResolver with broken count: 0
Done
The following additional packages will be installed:
binfmt-support qemu-user-binfmt
The following NEW packages will be installed:
binfmt-support qemu-user qemu-user-binfmt
0 upgraded, 3 newly installed, 0 to remove and 1 not upgraded.
Need to get 0 B/7.409 kB of archives.
After this operation, 63,4 MB of additional disk space will be used.
Do you want to continue? [Y/n]
Selecting previously unselected package binfmt-support.
(Reading database ... 245278 files and directories currently installed.)
Preparing to unpack .../binfmt-support_2.1.8-2_amd64.deb ...
Unpacking binfmt-support (2.1.8-2) ...
Selecting previously unselected package qemu-user.
Preparing to unpack .../qemu-user_1%3a2.11+dfsg-1ubuntu7.21_amd64.deb ...
Unpacking qemu-user (1:2.11+dfsg-1ubuntu7.21) ...
Selecting previously unselected package qemu-user-binfmt.
Preparing to unpack .../qemu-user-binfmt_1%3a2.11+dfsg-1ubuntu7.21_amd64.deb ...
Unpacking qemu-user-binfmt (1:2.11+dfsg-1ubuntu7.21) ...
Setting up binfmt-support (2.1.8-2) ...
Job for binfmt-support.service failed because the control process exited with error code.
See "systemctl status binfmt-support.service" and "journalctl -xe" for details.
invoke-rc.d: initscript binfmt-support, action "start" failed.
● binfmt-support.service - Enable support for additional executable binary formats
Loaded: loaded (/lib/systemd/system/binfmt-support.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Wed 2020-02-05 17:20:29 CET; 4ms ago
Docs: man:update-binfmts(8)
Process: 7766 ExecStart=/usr/sbin/update-binfmts --enable (code=exited, status=2)
Main PID: 7766 (code=exited, status=2)
feb 05 17:20:29 XPS-15-9570 systemd[1]: Starting Enable support for additional executable binary formats...
feb 05 17:20:29 XPS-15-9570 update-binfmts[7766]: update-binfmts: warning: unable to close /proc/sys/fs/binfmt_misc/register: No such file or directory
feb 05 17:20:29 XPS-15-9570 update-binfmts[7766]: update-binfmts: exiting due to previous errors
feb 05 17:20:29 XPS-15-9570 systemd[1]: binfmt-support.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
feb 05 17:20:29 XPS-15-9570 systemd[1]: binfmt-support.service: Failed with result 'exit-code'.
feb 05 17:20:29 XPS-15-9570 systemd[1]: Failed to start Enable support for additional executable binary formats.
Setting up qemu-user (1:2.11+dfsg-1ubuntu7.21) ...
Setting up qemu-user-binfmt (1:2.11+dfsg-1ubuntu7.21) ...
update-binfmts: warning: current package is qemu-user-binfmt, but binary format already installed by qemu-user-static
update-binfmts: exiting due to previous errors
dpkg: error processing package qemu-user-binfmt (--configure):
installed qemu-user-binfmt package post-installation script subprocess returned error exit status 2
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
Processing triggers for ureadahead (0.100.0-21) ...
Processing triggers for neon-settings (0.0+p18.04+git20191212.1343) ...
Processing triggers for systemd (237-3ubuntu10.33) ...
Errors were encountered while processing:
qemu-user-binfmt
E: Sub-process /usr/bin/dpkg returned an error code (1)
Despite that, I tried to go on with the usual procedure, namely:
docker buildx create --name mybuilder
docker buildx use mybuilder
docker buildx inspect --bootstrap
Where the output of the last command is:
[+] Building 5.0s (1/1) FINISHED
=> [internal] booting buildkit 5.0s
=> => pulling image moby/buildkit:buildx-stable-1 4.3s
=> => creating container buildx_buildkit_mybuilder0 0.7s
Name: mybuilder
Driver: docker-container
Nodes:
Name: mybuilder0
Endpoint: unix:///var/run/docker.sock
Status: running
Platforms: linux/amd64, linux/386
As you can see, "linux/amd64" and "linux/386" are listed as the only available platforms, however I would need to build the image for "linux/arm64" and "linux/arm/v7" platforms as well.
I've been looking for a solution to this problem for hours, though I didn't find anything that worked
------------------------------------ EDIT ------------------------------------
Looks like I was able to solve part of the issue by running:
sudo apt purge --auto-remove qemu-user qemu-user-binfmt binfmt-support
And then reinstalling them. In fact, running again this command:
sudo apt install qemu-user
gives no error at all:
Reading package lists... Done
Building dependency tree
Reading state information... Done
Starting pkgProblemResolver with broken count: 0
Starting 2 pkgProblemResolver with broken count: 0
Done
The following additional packages will be installed:
binfmt-support qemu-user-binfmt
The following NEW packages will be installed:
binfmt-support qemu-user qemu-user-binfmt
0 upgraded, 3 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/7.409 kB of archives.
After this operation, 63,4 MB of additional disk space will be used.
Do you want to continue? [Y/n]
Selecting previously unselected package binfmt-support.
(Reading database ... 245437 files and directories currently installed.)
Preparing to unpack .../binfmt-support_2.1.8-2_amd64.deb ...
Unpacking binfmt-support (2.1.8-2) ...
Selecting previously unselected package qemu-user.
Preparing to unpack .../qemu-user_1%3a2.11+dfsg-1ubuntu7.21_amd64.deb ...
Unpacking qemu-user (1:2.11+dfsg-1ubuntu7.21) ...
Selecting previously unselected package qemu-user-binfmt.
Preparing to unpack .../qemu-user-binfmt_1%3a2.11+dfsg-1ubuntu7.21_amd64.deb ...
Unpacking qemu-user-binfmt (1:2.11+dfsg-1ubuntu7.21) ...
Setting up binfmt-support (2.1.8-2) ...
Created symlink /etc/systemd/system/multi-user.target.wants/binfmt-support.service → /lib/systemd/system/binfmt-support.service.
Setting up qemu-user (1:2.11+dfsg-1ubuntu7.21) ...
Setting up qemu-user-binfmt (1:2.11+dfsg-1ubuntu7.21) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...
Processing triggers for ureadahead (0.100.0-21) ...
Processing triggers for neon-settings (0.0+p18.04+git20191212.1343) ...
Processing triggers for systemd (237-3ubuntu10.38) ...
Similarly, the output of systemctl status binfmt-support.service is as expected:
● binfmt-support.service - Enable support for additional executable binary formats
Loaded: loaded (/lib/systemd/system/binfmt-support.service; enabled; vendor preset: enabled)
Active: active (exited) since Mon 2020-02-10 11:42:23 CET; 1min 11s ago
Docs: man:update-binfmts(8)
Main PID: 7161 (code=exited, status=0/SUCCESS)
Tasks: 0 (limit: 4915)
CGroup: /system.slice/binfmt-support.service
feb 10 11:42:23 XPS-15-9570 systemd[1]: Starting Enable support for additional executable binary formats...
feb 10 11:42:23 XPS-15-9570 systemd[1]: Started Enable support for additional executable binary formats.
However, part of the issue is still there, as the output after running these three commands:
docker buildx create --name mybuilder
docker buildx use mybuilder
docker buildx inspect --bootstrap
is the same as before, namely:
[+] Building 2.6s (1/1) FINISHED
=> [internal] booting buildkit 2.6s
=> => pulling image moby/buildkit:buildx-stable-1 2.0s
=> => creating container buildx_buildkit_mybuilder0 0.6s
Name: mybuilder
Driver: docker-container
Nodes:
Name: mybuilder0
Endpoint: unix:///var/run/docker.sock
Status: running
Platforms: linux/amd64, linux/386
Why is that? Why is it showing me linux/amd64 and linux/386 as the only available platforms?
EDIT #2 (concerning #LinPy's comment)
The output of docker context ls is:
NAME DESCRIPTION DOCKER ENDPOINT KUBERNETES ENDPOINT ORCHESTRATOR
default * Current DOCKER_HOST based configuration unix:///var/run/docker.sock swarm
I've also tried to restart docker after qemu's installation, but to no success. Also, specifying the target platforms in the docker buildx command:
docker buildx build -t <mytag> --platform linux/amd64,linux/arm64,linux/arm/v7 --load .
results in this error:
[+] Building 0.6s (5/20)
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 32B 0.0s
=> [linux/arm/v7 internal] load metadata for docker.io/alegeno92/opencv_python3:3.4.2 0.6s
=> CANCELED [linux/arm64 internal] load metadata for docker.io/alegeno92/opencv_python3:3.4.2 0.6s
=> CANCELED [linux/amd64 internal] load metadata for docker.io/alegeno92/opencv_python3:3.4.2 0.6s
failed to solve: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to load LLB: runtime execution on platform linux/arm/v7 not supported
By the way, my version of the kernel is 4.15.0-76-generic
Run the multiarch container first
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
docker buildx rm builder
docker buildx create --name builder --driver docker-container --use
docker buildx inspect --bootstrap
And you should have your alternate architectures.
Tagging on this answer in response to the first error. The commands have been updated per https://docs.docker.com/buildx/working-with-buildx/.
QEMU is a cross-platform emulator responsible for sourcing the binaries for different architectures (through the binfmt_misc handler).
Will save some people some time to start with this command first:
docker run --privileged --rm tonistiigi/binfmt --install all
There are multiple binfmt packages, and there's a configuration that I think was missed when this question was asked.
For the various packages, I would opt for qemu-user-static over qemu-user-binfmt to avoid any dynamic linking issues. The two packages are doing the same thing, so you'll need to pick one or the other.
The next part should be fixed in current releases, but I think you were stumbling on this before. That's the fix binary or F flag you'll see when catting the files in /proc/sys/fs/binfmt_misc, e.g. see the F flag here:
$ cat /proc/sys/fs/binfmt_misc/qemu-arm
enabled
interpreter /usr/libexec/qemu-binfmt/arm-binfmt-P
flags: POCF
offset 0
magic 7f454c4601010100000000000000000002002800
mask ffffffffffffff00fffffffffffffffffeffffff
Details on what the F flag means can be found on this kernel.org post but the short of it is container namespaces include a different filesystem namespace, and trying to access the interpreter from that namespace will fail (unless you do something like bind mount /usr/libexec/qemu-binfmt into your container). Newer versions of the qemu packages automatically set this flag, so if your flags section doesn't have the F defined, see these bug reports for the version you'll need to upgrade to:
Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=868030
Ubuntu: https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815100
The easy button is to use the binaries from the multiarch image. This is good in CI if you have a dedicated VM (less ideal if you are modifying the host used by other builds). However if you reboot, it breaks until you run the container again. And it requires you to remember to update it for any upstream patches. So I wouldn't recommend it for a long running build host.
docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
For github CI, add following plugin solve this for me
- name: Set up QEMU
id: qemu
uses: docker/setup-qemu-action#v1
with:
image: tonistiigi/binfmt:latest
platforms: all

Slurm and Munge "Invalid Credential"

I'm installing slurm for the first time. I've installed the 19.05.1-2 tarball and used the configurator to make a very simple two node cluster. Control node is sdc, compute nodes (running slurmd) are sdc and sdc1. Both rebuilt with Ubuntu 18.04
I can start the controller, and the compute node sdc and also successfully submit jobs with srun. That's great. However, when I start slurmd on the second node, SDC1, I get:
slurmd: error: Unable to register: Zero Bytes were transmitted or received
That quickly led me to my munge configuration. Munge.log on the controller (sdc) shows "Invalid credential" every second. I triple checked that munge.key on both hosts are identical. I verified that ntp is running too.
So by hand I did munge -s foobar | unmunge on SDC1 and of course that worked locally. Then I saved the munged text from SDC1 to a file on SDC and tried unmunge. That did give me the error "Invalid credential" again.
Because of this I uninstalled and reinstalled munge on both systems, distributed the key and repeated that test with the same result.
I guess I'm missing something simple. I don't know what else to do to properly install munge.
It was UID/GID mismatch between nodes. Of course it's mentioned in the installation guide.
Did you remember to restart the munge daemon after copying the munge.key to /etc/munge? I got the same error doing
1: install slurm:
$ apt install -y slurm-client
2: copy slurm.conf
(perhaps create slurm-llnl beforehand):
$ cp slurm.conf /etc/slurm-llnl
3: copy munge key to client
(munge.key copied before from slurm server/slurmctld)
$ cp munge.key /etc/munge
and then I got all the invalid credetial errors and problems reported here and in reports including the 'Zero Bytes' error on the client side
[CLIENT]$ sinfo
slurm_load_partitions: Zero Bytes were transmitted or received
with corresponding entries in the Slurm SERVER/slurmctld logs ala
[SERVER]$ tail /var/log/munge/munged.log
2022-12-30 22:57:23 +0100 Notice: Running on ..
2022-12-30 23:01:11 +0100 Info: Invalid credential ...
and
[SERVER]$ tail /var/log/slurm-llnl/slurmctld.log
[2022-12-30T23:01:11.440] error: Munge decode failed: Invalid credential
[2022-12-30T23:01:11.440] ENCODED: Thu Jan 01 01:00:00 1970
[2022-12-30T23:01:11.440] DECODED: Thu Jan 01 01:00:00 1970
[2022-12-30T23:01:11.440] error: slurm_unpack_received_msg: REQUEST_PARTITION_INFO has authentication error: Invalid authentication credential
[2022-12-30T23:01:11.440] error: slurm_unpack_received_msg: Protocol authentication error
All of this is fixed by rebooting the client, as suggested by other here, or slightly less intrusive, just to restart the client munge daemon
(CLIENT)$ sudo systemctl restert munge.service
and then munge on client / unmunge on server works, but it also fixes my main problem of getting client to see the slurm server without the dreaded 'Zero Bytes' error
[CLIENT]$ sinfo
slurm_load_partitions: Zero Bytes were transmitted or received
with server log entries
[SERVER]$ tail /var/log/slurm-llnl/slurmctld.log
...
[2022-12-30T23:17:14.017] error: slurm_unpack_received_msg: Invalid Protocol Version 9472 from uid=-1 at XX.XX.XX.XX:44150
[2022-12-30T23:17:14.017] error: slurm_unpack_received_msg: Incompatible versions of client and server code
[2022-12-30T23:17:14.027] error: slurm_receive_msg [XX.XX.XX.XX:44150]: Unspecified error
And, after munge restart, voilà:
[CLIENT] $ sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
LocalQ* up infinite 1 idle XXX
for the examples: SERVER Ubuntu 20.04, CLIENTS Ubuntu 20.04 (and 22.04 that seem to be incompatible with the SERVER slurm version, says the log)

openproject configure throws "ERROR 1045 (28000): Access denied for user 'root'#'localhost' (using password: YES)"?

Installed openproject according to the CentOS 7 docs (https://www.openproject.org/download-and-installation/#installation) and after running sudo openproject configure, getting the error output as shown:
➜ ~ sudo openproject configure
Launching installer for openproject...
Selected addons: legacy-installer mysql apache2 repositories smtp memcached openproject
[legacy-installer] ./bin/configure
[mysql] ./bin/configure
DONE
[apache2] ./bin/configure
DONE
[repositories] ./bin/configure
DONE
[smtp] ./bin/configure
DONE
[memcached] ./bin/configure
DONE
[openproject] ./bin/configure
[legacy-installer] ./bin/preinstall
[mysql] ./bin/preinstall
[apache2] ./bin/preinstall
Note: Forwarding request to 'systemctl enable httpd.service'.
[repositories] ./bin/preinstall
[smtp] ./bin/preinstall
[memcached] ./bin/preinstall
No memcached server to install. Skipping.
[openproject] ./bin/preinstall
[legacy-installer] ./bin/postinstall
[mysql] ./bin/postinstall
ERROR 1045 (28000): Access denied for user 'root'#'localhost' (using password: YES)
Never used MariaDB before, but from some basic checking (installed following https://www.digitalocean.com/community/tutorials/how-to-reset-your-mysql-or-mariadb-root-password)
➜ ~ mysqladmin -u root -p version
Enter password:
mysqladmin Ver 9.0 Distrib 5.5.60-MariaDB, for Linux on x86_64
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Server version 5.5.60-MariaDB
Protocol version 10
Connection Localhost via UNIX socket
UNIX socket /var/lib/mysql/mysql.sock
Uptime: 21 min 38 sec
Threads: 1 Questions: 16 Slow queries: 0 Opens: 7 Flush tables: 2 Open tables: 25 Queries per second avg: 0.012
(where the passsword is just blank) the DB appears to be working and accessible. Removing and re-installing the mariadb-server package does not appear to change the behavior either. Never used openproject or mysql / mariadb before, so any advice on what could be happening here would be appreciated.
Did you choose Install and configure MySQL server locally during the configuration step?
If so, the package should automatically configure a random MySQL root password that is stored under the key mysql/root_password at /etc/openproject/installer.dat. If you can still access the database with an empty password, the setup failed to set a password correctly. It appears to take input from urandom.
What you could try is running the following command while hitting enter in the MySQL password prompt:
mysqladmin -u root -p password "<Create and insert a random password here>"
And then replacing the value at mysql/root_password at /etc/openproject/installer.dat.

Resources