SSH Daemon (NIO2) is not starting anymore - linux

I have two VMs running Gitblit under Jetty 9.2.11. Both use Java 8 (latest update, "1.8.0_77") on Ubuntu Server 14.04.
The only difference I've noticed is the kernel version.
One of them is
2.6.32-042stab111.12 #1 SMP Thu Sep 17 11:38:20 MSK 2015 x86_64 x86_64 x86_64 GNU/Linux (let's call it server 1)
and the other one is
2.6.32-042stab113.21 #1 SMP Wed Mar 23 11:05:25 MSK 2016 x86_64 x86_64 x86_64 GNU/Linux (let's call it server 2)
On server 1, everything works fine.
But on server 2, the Gitblit context never comes up.
The last entries in the log are:
2016-04-12 22:22:53 [INFO ] Federation passphrase is blank! This server can not be PULLED from.
2016-04-12 22:22:53 [INFO ] Fanout PubSub service is disabled.
2016-04-12 22:22:53 [INFO ] Git Daemon is listening on 0.0.0.0:9419
After that, the Jetty service fails and the context isn't available. The application stays in the STARTING state forever.
I've tried reinstalling the SSH server and client, with no success.
Can someone help me with this?
Regards

A few days after posting my question, I found the answer.
I followed the steps below to find the root cause of the problem:
I downloaded the source code of the Gitblit version I'm using, in this case version 1.7.1, available here.
In the source code I added some quick logging (using sysout) just to see where the application froze. I noticed that the problem was in the Apache MINA sshd code.
I also downloaded the source code of Apache MINA sshd to debug it. In this case I wasn't able to add sysout calls, so I chose to do remote debugging against the Jetty instance running on the server, as described here.
It was of course a bit slow, but I noticed that when the sshd code called SecureRandom.generateSeed(8) in the class SecurityUtils.BouncyCastleRandom
public BouncyCastleRandom() {
    ValidateUtils.checkTrue(isBouncyCastleRegistered(), "BouncyCastle not registered");
    this.random = new VMPCRandomGenerator();
    byte[] seed = new SecureRandom().generateSeed(8);
    this.random.addSeedMaterial(seed);
}
the system froze completely.
After a long time searching the internet I found this blog post, https://blog.cloudflare.com/ensuring-randomness-with-linuxs-random-number-generator/, and when I ran the command cat /proc/sys/kernel/random/entropy_avail I always got 0 (zero) as the result.
I knew that my Linux server is a VPS running under OpenVZ, hosted at host1plus. Based on that, I asked the VPS provider to check why I was always getting zero on my VPS.
The answer from technical support was:
We have enabled the random device for your VPS. Please check if it works for you and if the issue is solved.
After that change, my Gitblit was back up and running.
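The entropy check described above can also be scripted. The following is a minimal Python sketch, not part of the original answer: the threshold of 200 bits and the function names are assumptions chosen for illustration. The reasoning behind it is that a blocking seed read (as generateSeed most likely performs against /dev/random) can hang when the kernel reports no available entropy.

```python
from pathlib import Path
from typing import Optional

# Kernel file that reports the entropy estimate (path taken from the answer above).
ENTROPY_FILE = Path("/proc/sys/kernel/random/entropy_avail")

# Assumed cut-off for illustration only; it is not a value from the original post.
LOW_ENTROPY_THRESHOLD = 200


def parse_entropy(text: str) -> int:
    """Convert the file content (e.g. "0\n") into an integer bit count."""
    return int(text.strip())


def read_entropy(path: Path = ENTROPY_FILE) -> Optional[int]:
    """Return the available entropy in bits, or None if it cannot be read."""
    try:
        return parse_entropy(path.read_text())
    except (OSError, ValueError):
        return None


def entropy_is_low(bits: int, threshold: int = LOW_ENTROPY_THRESHOLD) -> bool:
    """True when the pool is below the (assumed) threshold."""
    return bits < threshold


if __name__ == "__main__":
    bits = read_entropy()
    if bits is None:
        print("entropy_avail is not readable on this system")
    elif entropy_is_low(bits):
        print(f"only {bits} bits of entropy available; blocking reads may hang")
    else:
        print(f"{bits} bits of entropy available")
```

Running this on the affected VPS before the provider's fix would presumably have reported 0 bits, matching the cat output described above.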

Related

Are docker images and containers visible to everyone on the server?

I am trying to make a docker image for one of my simulators. But is the docker image on the server shared by all users?
I just tried the docker images command, and the results showed that there are several images:
REPOSITORY            TAG       IMAGE ID       CREATED         SIZE
ubuntu                latest    9873176a8ff5   2 months ago    72.7MB
hello-world           latest    d1165f221234   6 months ago    13.3kB
mpx_evaluation        latest    ae93b04419ab   13 months ago   686MB
ubuntu                16.04     77be327e4b63   18 months ago   124MB
e9patch/e9patchdemo   latest    e73fd4d392d8   19 months ago   696MB
neo4j                 latest    7e40ffda399a   2 years ago     362MB
Are these images used by others? I don't want everyone to be able to see the image I made. Is there any way to prevent that? Why are the images visible to everyone? What if someone deletes my image by mistake?
The server I am using is: Linux server81 4.15.0-142-generic #146~16.04.1-Ubuntu SMP Tue Apr 13 09:27:15 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Many people use this server.
Anyone who has access to the Docker socket, the interface used to control the Docker daemon (which is responsible, for example, for building images), can see your images on the host machine. In general, users with root access have access to the Docker socket and hence can see your images.
If you were the only root user on the machine, other users would not have access to the images as long as they were not in the docker group (https://docs.docker.com/engine/install/linux-postinstall/).
Other than that, I don't see a way to make your images "private" from other users who can control the Docker socket.
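To see which side of that line a given account falls on, one option is to test whether the current user can read and write the socket file. This is a minimal Python sketch under the assumption that the daemon uses the default socket path /var/run/docker.sock; the function name is made up for illustration.

```python
import os

# Default Docker socket location on Linux; an installation may use a different path.
DOCKER_SOCKET = "/var/run/docker.sock"


def can_use_docker_socket(path: str = DOCKER_SOCKET) -> bool:
    """True if the current user can read and write the socket at `path`."""
    return os.path.exists(path) and os.access(path, os.R_OK | os.W_OK)


if __name__ == "__main__":
    if can_use_docker_socket():
        print("This user can talk to the Docker daemon and list all images.")
    else:
        print("This user cannot access the Docker socket.")
```

Any account for which this prints the first message can run docker images and docker rmi against every image on the host, which is the situation the answer describes.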

Upgrading SLES 15.1 to 15.2 causing Varnish to Fail

Very recently I ran an Online Migration update through YaST on SUSE Linux Enterprise Server (SLES) 15.1 to 15.2 and ended up with the following versions of these after doing so:
SLES 15.2
Apache 2.4.43
MariaDB 10.4.17
PHP 7.4.6
Varnish 6.2.1
My main linux architecture is now as follows:
The preliminary tests showed no conflicts or issues prior to the upgrade and it rebooted and came up just fine when it all completed. Upon checking everything afterwards, I noticed that the varnish.service (varnishd) had failed to start. I've never had an issue with Varnish not starting, whether it was SUSE Linux, CentOS, Ubuntu, etc. I thought at first my custom vcl file was causing issues so I went with the default configuration file that it comes with (/etc/varnish/vcl.conf) just to start fresh with the basics but to no avail. The exact same issue happened.
Then I decided to take a shot and compile Varnish from source. Through YaST, I removed the varnish package and all of its configuration and service files, and then I downloaded the latest TAR Archive file (varnish-6.6.0.tgz) direct from https://varnish-cache.org/. After compiling and making Varnish this way, ironically, the same issue is happening when I try to start Varnish.
With either the compiled build (v6.6.0) or the packaged service (v6.2.1), I get exactly the same error(s):
It reports "Child not responding to CLI, killed it.", then mentions a "CLI communication error (hdr)", and finally "Child died signal=6".
What's most puzzling is that Varnish fails in exactly the same way however I set it up. I suppose this indicates that Varnish isn't the issue per se, but rather something within the server configuration? I've been through every forum on Varnish that I could find and have found nothing this specific. I have even tried to get it to start with different CLI parameters (like timeout settings, pool delays, etc.) but it still won't start. Again, this is with only the most basic/default configuration file loaded and nothing else.
# Marker to tell the VCL compiler that this VCL has been adapted to the
# new 4.0 format.
vcl 4.0;

# Default backend definition. Set this to point to your content server.
backend default {
    .host = "127.0.0.1";
    .port = "80";
}
Now here's the ultimate kicker... I took another (Development) server, slicked it bare, and installed SLES 15.2 from scratch and everything, including Varnish, works! So something with the in-place upgrade is stopping Varnish somehow. I can't take the main (Production) SLES 15.2 server and start over with it like that, however, because of so many other things that are currently installed and configured on it.
I'm trying to get Varnish back up and started within the current upgraded environment, but nothing seems to be working. Also, there is nothing in the Varnish logs (/var/log/varnish/varnish.log) to give me any clue either.
I'm at a loss as to what to try or where to go next. I've even tried starting Varnish in Debug Mode (-d) and then trying to get a child to start that way and it's the exact same error.
And ultimately, I can't check for any panics because Varnish won't even start in the first place.
So to recap, literally all I did was run the in-place upgrade from SLES 15.1 to 15.2, rebooted when it was all done, and now all other services start fine except for Varnish (which worked perfectly on 15.1).
UPDATE #1: I tried to start varnish with no vcl file and no backend (varnishd -b none) but it errored out. Then I simply substituted "none" with "localhost" and I'm right back to the same error as before.
UPDATE #2: Here is the output of the "strace -f varnishd" command.
StraceOutput.txt
VCL loop
This is a long shot, but can you please change the .port property in your backend to 8080 instead of 80? Just for testing.
Because if you start varnishd without an explicit -a, the standard listening port will be 80. But since your VCL file already connects to port 80 on localhost for its backend, you might end up in a loop.
I'm not saying the assert() that is triggered on your system is caused by this, but it's worth the attempt.
In older versions of Varnish, the standard port was 6081, but this has changed in recent versions.
What I am sure of, is that the error is caused by a file descriptor that is not available. Maybe a file descriptor that has already been closed.
Please give it a shot, and let me know.
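The loop condition described above (Varnish listening on the same local port its backend points to) can be checked mechanically. The sketch below is a simplification in Python, not something from the original answer: it only looks at the first .host and .port it finds in the VCL text, only handles numeric ports, and the set of loopback names is an assumption.

```python
import re
from typing import Optional, Tuple

# Host names treated as "this machine" for the purpose of the check (assumption).
LOOPBACK_HOSTS = {"127.0.0.1", "localhost", "::1"}


def parse_backend(vcl_text: str) -> Optional[Tuple[str, int]]:
    """Extract (host, port) from the first backend definition, if present."""
    host = re.search(r'\.host\s*=\s*"([^"]+)"', vcl_text)
    port = re.search(r'\.port\s*=\s*"(\d+)"', vcl_text)
    if not host or not port:
        return None
    return host.group(1), int(port.group(1))


def would_loop(listen_port: int, backend: Tuple[str, int]) -> bool:
    """True if the backend is the local machine on the port Varnish listens on."""
    host, port = backend
    return host in LOOPBACK_HOSTS and port == listen_port


if __name__ == "__main__":
    vcl = 'vcl 4.0;\nbackend default {\n    .host = "127.0.0.1";\n    .port = "80";\n}\n'
    backend = parse_backend(vcl)
    if backend is not None:
        print("possible loop" if would_loop(80, backend) else "no loop")
```

With the default VCL from the question and a listening port of 80, the check flags a possible loop; changing .port to 8080, as suggested, clears it.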
Debug mode
It is also possible to enable debug mode by adding the -d runtime parameter to your varnishd command.
Please give it a try to increase the verbosity of the debug output.
Checking panics
Another thing you can do is run the following command to see if any panics occurred:
varnishadm panic.show
Trying out various runtime options
Apparently the error refers to the fact that it cannot load the VCL file.
Let's try running varnishd without a VCL file to see whether or not that's the problem.
Just try starting varnishd using the following command:
varnishd -b none
This command will start Varnish without a VCL file and without a backend. When you then try to access Varnish via HTTP, you should be getting an HTTP 503 error. That's not perfect, but at least we know that Varnish is capable of not crashing all the time.
Once that works, you can remove -b and add your -f parameter that refers to the VCL file.
If that also works, try playing around with the -s setting.
And so on, and so forth.
Use packages
Other than that, the only advice I can give you is to install Varnish using the official packages on a supported operating system (Debian, Ubuntu, Fedora, CentOS, RHEL).
When checking the output of the requested strace command, I found this:
[pid 1129] mkdir("vcl_boot.1621874391.008263", 0755) = 0
[pid 1129] chown("vcl_boot.1621874391.008263", 465, 463) = 0
[pid 1129] setresuid(-1, 465, -1) = 0
[pid 1129] openat(AT_FDCWD, "vcl_boot.1621874391.008263/vgc.c", O_WRONLY|O_CREAT|O_TRUNC, 0640) = 5
[pid 1129] fchown(5, 0, 0) = -1 EPERM (Operation not permitted)
[pid 1129] geteuid() = 465
[pid 1129] close(5) = 0
[pid 1129] openat(AT_FDCWD, "vcl_boot.1621874391.008263/vgc.so", O_WRONLY|O_CREAT|O_TRUNC, 0640) = 5
[pid 1129] fchown(5, 0, 0) = -1 EPERM (Operation not permitted)
Varnishd tries to change the owner of at least two files, but isn't allowed to do so. I'm not sure about the details, but as a next step you could try to find these files (probably below /var/cache/varnish) and check the current permissions. Maybe they belong to a user which is not the user you're running varnishd with.
AFAIK the daemon is started as user root and then the process switches to an unprivileged user. This assumption brings us back to my previous question: Are you running AppArmor or SELinux?
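With a long strace -f log, the failed calls like the fchown lines above are easy to miss. The following Python sketch filters them out; it assumes the "[pid N] call(args) = -1 ERRNO (message)" line shape seen in the excerpt, and the names FailedCall and failed_calls are made up for illustration.

```python
import re
from typing import List, NamedTuple


class FailedCall(NamedTuple):
    pid: int
    syscall: str
    args: str
    errno: str


# Matches lines such as:
# [pid 1129] fchown(5, 0, 0) = -1 EPERM (Operation not permitted)
_FAILED = re.compile(r"^\[pid\s+(\d+)\]\s+(\w+)\((.*)\)\s+=\s+-1\s+([A-Z0-9]+)\b")


def failed_calls(strace_text: str) -> List[FailedCall]:
    """Return every system call in the log that returned -1 with an errno."""
    result = []
    for line in strace_text.splitlines():
        match = _FAILED.match(line.strip())
        if match:
            pid, syscall, args, errno = match.groups()
            result.append(FailedCall(int(pid), syscall, args, errno))
    return result


if __name__ == "__main__":
    sample = (
        '[pid 1129] setresuid(-1, 465, -1) = 0\n'
        '[pid 1129] fchown(5, 0, 0) = -1 EPERM (Operation not permitted)\n'
    )
    for call in failed_calls(sample):
        print(f"pid {call.pid}: {call.syscall}({call.args}) failed with {call.errno}")
```

Lines without a pid prefix (for example from a run without -f) are ignored by this pattern and would need a looser expression.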

VS Code SSH Remote Connection issues

I have been using VS Code on my MacBook Pro to connect remotely from home and work on a college project for the past month, and for some reason it will no longer connect to the Computer Lab server. It just stopped working today, and I have no idea why. I tried re-installing VS Code and also installed it on my wife's computer, but it still won't connect through Remote SSH. Now I have no way to debug my code and have to edit everything using emacs through the Terminal app on my Mac. I didn't make any changes between last night and this morning. I can still ssh into the Computer Lab server from the terminal just fine. Below is part of the log that seems to repeat itself while it is trying to connect using the Remote SSH extension.
Any help on this would be greatly appreciated. Alternatively, are there other IDEs available for Mac that are reasonably easy to connect remotely through ssh?
MY LOG:
[17:09:21.150] Log Level: 2
[17:09:21.152] remote-ssh#0.55.0
[17:09:21.152] darwin x64
[17:09:21.153] SSH Resolver called for "ssh-remote+7b22686f73744e616d65223a226c696e75782e63732e75736d2e6d61696e652e656475222c2275736572223a22746b7766c6b227d", attempt 1
[17:09:21.154] SSH Resolver called for host: tkwilk#linux.cs.usm.maine.edu
[17:09:21.154] Setting up SSH remote "linux.cs.usm.maine.edu"
[17:09:21.158] Acquiring local install lock: /var/folders/9y/scfwvr0577qfgs_l_c5ym13m0000gq/T/vscode-remote-ssh-tkwilk#linux.cs.usm.maine.edu-install.lock
[17:09:21.192] Looking for existing server data file at /Users/twilk31888 1/Library/Application Support/Code/User/globalStorage/ms-vscode-remote.remote-ssh/vscode-ssh-host-tkwilk#linux.cs.usm.maine.edu-93c2f0fbf16c5a4b10e4d5f89737d9c2c25488a3-0.55.0/data.json
[17:09:21.194] Using commit id "93c2f0fbf16c5a4b10e4d5f89737d9c2c25488a3" and quality "stable" for server
[17:09:21.195] Install and start server if needed
[17:09:21.220] Checking ssh with "ssh -V"
[17:09:21.233] > OpenSSH_8.1p1, LibreSSL 2.7.3
[17:09:21.249] askpass server listening on /var/folders/9y/scfwvr0577qfgs_l_c5ym13m0000gq/T/vscode-ssh-askpass-a45a56dcf061823c964fa6ae7ff720ac39d2477f.sock
[17:09:21.249] Spawning local server with {"ipcHandlePath":"/var/folders/9y/scfwvr0577qfgs_l_c5ym13m0000gq/T/vscode-ssh-askpass-c1cf58194111018972f9cf0cd413a94b7293bda9.sock","sshCommand":"ssh","sshArgs":["-v","-T","-D","54601","-o","ConnectTimeout=15","tkwilk#linux.cs.usm.maine.edu"],"dataFilePath":"/Users/twilk31888 1/Library/Application Support/Code/User/globalStorage/ms-vscode-remote.remote-ssh/vscode-ssh-host-tkwilk#linux.cs.usm.maine.edu-93c2f0fbf16c5a4b10e4d5f89737d9c2c25488a3-0.55.0/data.json"}
[17:09:21.249] Local server env: {"DISPLAY":"1","ELECTRON_RUN_AS_NODE":"1","SSH_ASKPASS":"/Users/twilk31888 1/.vscode/extensions/ms-vscode-remote.remote-ssh-0.55.0/out/local-server/askpass.sh","VSCODE_SSH_ASKPASS_NODE":"/Applications/Visual Studio Code.app/Contents/Frameworks/Code Helper (Renderer).app/Contents/MacOS/Code Helper (Renderer)","VSCODE_SSH_ASKPASS_MAIN":"/Users/twilk31888 1/.vscode/extensions/ms-vscode-remote.remote-ssh-0.55.0/out/askpass-main.js","VSCODE_SSH_ASKPASS_HANDLE":"/var/folders/9y/scfwvr0577qfgs_l_c5ym13m0000gq/T/vscode-ssh-askpass-a45a56dcf061823c964fa6ae7ff720ac39d2477f.sock"}
[17:09:21.262] Spawned 4239
[17:09:21.373] > local-server> Spawned ssh: 4240
[17:09:21.379] stderr> OpenSSH_8.1p1, LibreSSL 2.7.3
[17:09:21.756] stderr> debug1: Server host key: ecdsa-sha2-nistp256 SHA256:wny4SU/uVC6y9cUUH5kJnRe5SVWpBhWGABpWSYzMNG0
[17:09:22.132] stderr> Authenticated to linux.cs.usm.maine.edu ([130.111.131.121]:22).
[17:09:22.490] > ready: 946b80caa0f2
[17:09:22.553] > Linux 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020
[17:09:22.554] Platform: linux
[17:09:22.685] > 946b80caa0f2: running
[17:09:22.713] > Acquiring lock on /home/students/tkwilk/.vscode-server/bin/93c2f0fbf16c5a4b10e4d5f89737d9c2c25488a3/vscode-remote-lock.tkwilk.93c2f0fbf16c5a4b10e4d5f89737d9c2c25488a3
> Installation already in progress...
> 946b80caa0f2##24##
[17:09:22.714] Received install output: 946b80caa0f2##24##
[17:09:22.714] Server installation process already in progress - waiting and retrying
[17:09:22.714] Terminating local server
[17:09:22.740] Local server exit: 15
The key information is in this line:
[17:09:22.713] > Acquiring lock on /home/students/tkwilk/.vscode-server/bin/93c2f0fbf16c5a4b10e4d5f89737d9c2c25488a3/vscode-remote-lock.tkwilk.93c2f0fbf16c5a4b10e4d5f89737d9c2c25488a3
If you can ssh into the server, remove the lock file with
rm -rf /home/students/tkwilk/.vscode-server/bin/93c2f0fbf16c5a4b10e4d5f89737d9c2c25488a3/vscode-remote-lock.tkwilk.93c2f0fbf16c5a4b10e4d5f89737d9c2c25488a3
then restart VS Code and try to connect again; things should be fine.
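Finding the lock path by eye in a long Remote SSH log is tedious, so here is a small Python sketch that pulls it out. It relies only on the two log messages quoted above ("Acquiring lock on ..." followed shortly by "Installation already in progress"); the function name and the two-line lookahead are assumptions for illustration.

```python
import re
from typing import List

_LOCK_LINE = re.compile(r"Acquiring lock on (\S+)")
_IN_PROGRESS = "Installation already in progress"


def stuck_lock_paths(log_text: str, lookahead: int = 2) -> List[str]:
    """Return lock files that the log reports as held by another installation."""
    lines = log_text.splitlines()
    paths: List[str] = []
    for index, line in enumerate(lines):
        match = _LOCK_LINE.search(line)
        if not match:
            continue
        # Only report the lock if one of the next few lines says it is still held.
        following = lines[index + 1 : index + 1 + lookahead]
        if any(_IN_PROGRESS in nxt for nxt in following) and match.group(1) not in paths:
            paths.append(match.group(1))
    return paths


if __name__ == "__main__":
    import sys

    # Usage: paste the Remote SSH output into a file and pass its name.
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8") as handle:
            for path in stuck_lock_paths(handle.read()):
                print(path)
```

The printed paths are candidates for the rm command above; whether a lock is actually stale still has to be judged by hand.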
I encountered the same problem on two servers, with two different causes:
The first was solved by following this issue: #2805
Command Palette -> Select "Remote-SSH: Kill VS Code Server on Host..."
Remove the "~/.vscode-server" directory on the remote server.
The other was caused by running out of storage quota on that server. The issue went away automatically when the quota was increased.
Most of the microsoft/vscode-remote-release issues I see, like issue 2901, are about a failed symlink on the target server.
If you can ssh in from the command line, try renaming /home/students/tkwilk/.vscode-server in order to force VS Code to re-install the SSH remote plugin completely.
mv ~/.vscode-server ~/.vscode-server-old
Then try to connect to that server through VS Code and see if the issue persists when it redoes the complete vscode-server SSH setup.
I found another cause, though it may be rare:
Before I ran into this problem, I had updated and modified the Linux kernel of the remote virtual machine, including changing UTS_SYSNAME in /include/linux/uts.h:
#define UTS_SYSNAME "Linux Clstilmldy-LZM"
// #define UTS_SYSNAME "Linux"
After that I hit this problem and could not find a workable solution at first.
I looked carefully at the VS Code output and found that Remote SSH reported: Unsupported platform: Linux Clstilmldy-LZM
[16:38:25.333] SSH Resolver called for host: Ubuntu
[16:38:25.334] Setting up SSH remote "Ubuntu"
...
[16:38:35.555] Got password response
[16:38:35.555] "install" wrote data to terminal: "******"
[16:38:35.574] >
[16:38:36.069] > ac25402ecd5f: running
[16:38:36.086] > Unsupported platform: Linux Clstilmldy-LZM
[16:38:36.096] > ac25402ecd5f: start
My guess is that VS Code Remote SSH does not recognize system names other than Linux, Mac, and Windows, so I changed this line back.
I recompiled and reinstalled the kernel, and that solved the problem.
Another answer, since none of these worked for me: try toggling off the following setting in VS Code: remote.SSH.useFlock

Installing ColdFusion 9 on Ubuntu 14.04, getting error running connector wizard

The Setup:
I'm trying to install ColdFusion 9 on Ubuntu 14.04 with Apache 2.4.7. Seriously. Don't ask.
Spun up a Vagrant Box (xplore/ubuntu-14.04) that has the LAMP stack installed;
Performed apt-get update and apt-get upgrade;
Installed libstdc++5 (but still got a warning that CF couldn't verify it was installed);
Installed CF from ColdFusion_9_WWEJ_linux64.bin.
I had to create a symlink to /etc/apache2/apache2.conf called /etc/apache2/httpd.conf in order to get CF installed, because CF9 doesn't allow you to specify an apache config filename, but other than that everything went smoothly.
The Problem:
When I start CF using ./opt/coldfusion9/bin/coldfusion start I get this message:
There was an error while running the connector wizard
Connector installation was not successful
...which is the result of cf-connectors.sh modifying my apache2.conf, telling it to load the module /opt/coldfusion9/runtime/lib/wsconfig/1/mod_jrun22.so, then attempting to restart Apache and failing due to this error:
apache2: Syntax error on line 223 of /etc/apache2/apache2.conf:
Cannot load /opt/coldfusion9/runtime/lib/wsconfig/1/mod_jrun22.so into server:
/opt/coldfusion9/runtime/lib/wsconfig/1/mod_jrun22.so:
undefined symbol: ap_log_error
Troubleshooting Steps Taken:
I tailed the Apache error log, but that wasn't much help:
[mpm_prefork:notice] [pid 1516] AH00173: SIGHUP received. Attempting to restart
[mpm_prefork:notice] [pid 1516] AH00163: Apache/2.4.7 (Ubuntu) PHP/5.5.9-1ubuntu4.3 configured -- resuming normal operations
[core:notice] [pid 1516] AH00094: Command line: '/usr/sbin/apache2'
The JRun binary file does exist, in /opt/coldfusion9/runtime/bin/jrun. However, I've seen tutorials like this one that show it being located in /opt/jrun4...which is weird because my version of CF9 is referencing mod_jrun22.so, leading me to believe there is a version difference.
Running ./opt/coldfusion9/runtime/bin/jrun status, I get this output:
The coldfusion server is running
No jndi.properties file was found in samples's SERVER-INF directory. The JRun kernel requires JNDI information.
The samples server is not running
The admin server is not running
...which tells me that a jndi.properties file is missing, and that the samples and admin servers are not running. I assume that is a result of cf-connectors.sh failing.
The Question:
How can I get the CF connector wizard to succeed? What am I missing here?
Thanks in advance!
Apache 2.4.x is not supported by ColdFusion 9; see my answer here:
Apache won't start with ColdFusion 10: mod_jk.conf procedure not found
I suggest you install Apache 2.2 and then you should be able to install the Connector.
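Before running the connector wizard, the installed Apache version can be checked against that limit. The Python sketch below parses a version banner such as the one in the error log above. The rule "2.4 and later are too new" is taken only from this answer, and the function names are hypothetical.

```python
import re
from typing import Optional, Tuple


def apache_version(banner: str) -> Optional[Tuple[int, ...]]:
    """Extract the version tuple from text containing e.g. 'Apache/2.4.7 (Ubuntu)'."""
    match = re.search(r"Apache/(\d+(?:\.\d+)*)", banner)
    if not match:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def too_new_for_cf9_connector(version: Tuple[int, ...]) -> bool:
    """Assumption from the answer above: the stock connector does not load on 2.4+."""
    return version[:2] >= (2, 4)


if __name__ == "__main__":
    line = "AH00163: Apache/2.4.7 (Ubuntu) PHP/5.5.9-1ubuntu4.3 configured -- resuming normal operations"
    version = apache_version(line)
    if version is not None and too_new_for_cf9_connector(version):
        print("Apache", ".".join(map(str, version)), "is newer than the stock CF9 connector expects")
```

The banner can come from the error log, as here, or from the output of apache2 -v.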
While Apache 2.4 is not supported by Adobe, it is possible to get it running by recompiling the mod_jrun module against the Apache 2.4 sources (after a small modification to the source code).
There are full instructions on my blog post, if you're still interested.
mod_jrun on Apache 2.4 (Ubuntu 14.04 + ColdFusion 9)

OTRS installation error on openSUSE

I have a fresh, text-only installation of openSUSE 13.1 (physical server, old Samsung netbook), and I'm trying to get OTRS up and running. I've installed OTRS using the commands below. I don't think they're all necessary, but someone in the OtterHub forums had a successful installation with the software versions I'm targeting using this sequence, so I was trying to piggyback on that success.
zypper in otrs-3.3.4-01.noarch.rpm gcc make mysql-community-server perl-Crypt-SSLeay perl-JSON-XS perl-YAML-LibYAML
zypper in perl-Text-CSV_XS perl-PDF-API2 perl-GDGraph perl-Encode-HanExtra postfix perl-DBD-mysql
cd ~otrs && bin/otrs.SetPermissions.pl --otrs-user=otrs --web-user=wwwrun --otrs-group=www --web-group=www /opt/otrs
rcmysql start
systemctl start apache2.service
mysqladmin --user=root password password
All of that works fine. I'm able to get to the OTRS web installer, but that's where I get hung up. I get to the part of the web installer that creates the database, and it times out. The script successfully creates the database and updates Config.pm with the new password. I can't tell from installer.pl what it tries to do next.
Here's the error from /var/log/apache2/error_log:
[Tue Jan 28 20:53:23.136306 2014] [cgi:warn] [pid 6856] [client 192.168.1.10:52732] AH01220: Timeout waiting for output from CGI script /opt/otrs/bin/cgi-bin/installer.pl, referer: http://svr-clptest/otrs/installer.pl
[Tue Jan 28 20:53:23.136470 2014] [cgi:error] [pid 6856] [client 192.168.1.10:52732] Script timed out before returning headers: installer.pl, referer: http://svr-clptest/otrs/installer.pl
The browser displays the following:
The gateway did not receive a timely response from the upstream server or application.
This is on a local network at home. I'm accessing the Linux server using PuTTY from a Windows 8 machine. I'm using a wireless connection from the Windows 8 machine, but the server has a hard line connection to the router, if that makes any difference. I don't have any trouble executing anything from PuTTY or accessing the index page through the browser (Firefox 26). I've tried connecting from a computer on my network, and one off of my network. In both cases, I'm able to get to my domain and the web installer. But I can't make a PuTTY connection to the server from outside my network.
I've spent a couple of hours researching the error, and I can't figure out what the next step should be.
Right now, a text-only version of openSUSE and OTRS are the only things running on the machine. I haven't done anything else with it. I'm open to starting the installation from scratch again--OS and all. I'm thinking that the timeout error has something to do with my firewall settings, but I'm not a network guy. Really have no idea how to diagnose this.
UPDATE
I tried reinstalling everything fresh tonight, but then added KDE so I could walk through the web installer on the host. I get exactly the same errors. It's not a problem between server and client. Something's wrong with OTRS... Or maybe with apache?
I eventually just had to follow the steps for manual installation instead of using the web installer. I'm not sure where the problem was exactly, but no matter what I tried, I couldn't get the database setup to work through the web installer. If you're having a similar problem, once you get to the part of the instructions that tells you to move to the web installer, you can switch over to the instructions for installing from source and pick up from the manual installation of the database.
