Which process sends SIGKILL and terminates all SSH connections to my Namecheap server?

I've been trying to troubleshoot this problem for a few days now.
A couple of minutes after I start an SSH connection to my Namecheap server (from Mac, Windows, or cPanel's "Terminal"), the session crashes and gives the following error message:
Error: The connection to the server ended in failure at {TIME} PM. (SIGKILL)
and:
Exit Code: 137
I've tried to set up some kind of logging for SIGKILL signals, but it seems none can be created on a Namecheap server:
auditctl doesn't exist, and
we can't get systemtap because no package manager is available.
Details:
uname -a: Linux [-n] 2.6.32-954.3.5.lve1.4.78.el6.x86_64 #1 SMP Thu Mar 26 08:20:27 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux
I timed the interval between crashes: around 6 minutes.
I don't have very deep knowledge of Linux servers and may not have included all the needed information, so please ask for any specifics!
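For what it's worth, exit code 137 is 128 + 9, i.e., the shell reporting death by SIGKILL. SIGKILL cannot be caught or logged by the process receiving it, so one low-tech option on a restricted shared host is a cron job (or second session) that snapshots the process table every few seconds; the last snapshot written before a kill shows what was running alongside the session. A minimal sketch, assuming only a writable home directory:

# record who and what is running, once every 5 seconds
while true; do
    ps -eo pid,ppid,user,etime,cmd > ~/ps-snap-$(date +%s).log
    sleep 5
done

The kernel string above ends in "lve", which suggests CloudLinux; its per-user resource limits (LVE) are a common source of SIGKILLs on shared hosting, so asking Namecheap about session or process limits may be the shortest path.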

Related

Elasticsearch Enabling Remote Connection - Crashes AFTER Change

I just installed Filebeat, Logstash, Kibana, and Elasticsearch, all running smoothly, to trial the product for additional monthly reports/monitoring. I noticed that every time I change the "/etc/elasticsearch/elasticsearch.yml" config file to allow remote web access, the service crashes.
I just want to say I'm new to the forum and this product. My end goal is to figure out how to allow remote connections to Elasticsearch so I can experiment and test without crashing it.
For reference, here is the error output when I run 'sudo systemctl status elasticsearch':
Dec 30 07:27:37 ubuntu systemd[1]: Starting Elasticsearch...
Dec 30 07:27:52 ubuntu systemd-entrypoint[4067]: ERROR: [1] bootstrap checks failed. You must address the points described in the following [1] lines before starting Elasticsearch.
Dec 30 07:27:52 ubuntu systemd-entrypoint[4067]: bootstrap check failure [1] of [1]: the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.se>
Dec 30 07:27:52 ubuntu systemd-entrypoint[4067]: ERROR: Elasticsearch did not exit normally - check the logs at /var/log/elasticsearch/elasticsearch.log
Dec 30 07:27:53 ubuntu systemd[1]: elasticsearch.service: Main process exited, code=exited, status=78/CONFIG
Dec 30 07:27:53 ubuntu systemd[1]: elasticsearch.service: Failed with result 'exit-code'.
Dec 30 07:27:53 ubuntu systemd[1]: Failed to start Elasticsearch.
Any help on this is greatly appreciated!
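The bootstrap check in the log fires because pointing network.host at a non-loopback address switches Elasticsearch into production mode, which requires explicit discovery settings. For a single-node trial box, a minimal sketch of a fix (this assumes no multi-node cluster is wanted; 0.0.0.0 exposes the node on every interface, so restrict access with a firewall):

# append the settings to the config (skip network.host if it is already set), then restart
echo 'network.host: 0.0.0.0' | sudo tee -a /etc/elasticsearch/elasticsearch.yml
echo 'discovery.type: single-node' | sudo tee -a /etc/elasticsearch/elasticsearch.yml
sudo systemctl restart elasticsearch

With discovery.type set to single-node, the discovery bootstrap check quoted above no longer applies.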

Dovecot unable to start due to address already in use

I upgraded my Linux kernel and dovecot failed to start with the following error messages:
Error: service(managesieve-login): listen(*, 4190) failed: Address already in use
Error: service(pop3-login): listen(*, 110) failed: Address already in use
Error: service(pop3-login): listen(*, 995) failed: Address already in use
Error: service(imap-login): listen(*, 143) failed: Address already in use
Error: service(imap-login): listen(*, 993) failed: Address already in use
Fatal: Failed to start listeners
Strangely enough, I couldn't find any process bound to those ports. All the commands below return nothing.
# netstat -tulpn | grep 110
# ss -tulpn |grep 110
# fuser 110/tcp
# lsof -i :110
I also tried changing the listen setting to my specific IP address, and it still failed the same way.
Any idea how I can solve this problem? Here's my version info:
# uname -a
Linux ip-172-31-26-222 4.14.177-107.254.amzn1.x86_64 #1 SMP Thu May 7 18:30:14 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
# dovecot --version
2.2.36 (1f10bfa63)
Hi, it looks like you are using AWS, as I am. I recently updated via yum as well and noticed that a new package named 'portreserve' was also installed. I killed that process, left /etc/dovecot/dovecot.conf as it was, and then started Dovecot successfully. My mail clients were immediately able to reconnect. I hope that helps you.
I also restarted the portreserve program afterwards, since reserving ports to limit access to them seems useful.
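For reference, portreserve holds on to ports listed under /etc/portreserve/ so that other daemons can't grab them before the intended service starts, and since it binds them without listening, they may not show up in listings of listening sockets, which would explain the empty netstat/ss output above. The companion portrelease command is a gentler alternative to killing the daemon; a sketch, assuming the update shipped a dovecot reservation file (sysvinit commands, as on Amazon Linux 1):

# list the services with reserved ports
ls /etc/portreserve/
# hand Dovecot's reserved ports back, then start Dovecot
portrelease dovecot
service dovecot start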

Ubuntu Xenial time discrepancy with VM and Windows host

I have three fresh installs of Ubuntu 16.04.2 LTS (Xenial) on Azure VMs. In the system log I noticed a time discrepancy, and the system is logging this every 5 seconds:
Mar 5 17:57:57 server1 systemd[1]: snapd.refresh.timer: Adding 2h 17min 4.279485s random time.
Mar 5 17:57:57 server1 systemd[1]: apt-daily.timer: Adding 5h 14min 48.381690s random time.
Mar 5 17:57:57 server1 systemd[19425]: Time has been changed
Mar 5 17:57:57 server1 systemd[37054]: Time has been changed
I have stopped the two timers, apt-daily.timer and snapd.refresh.timer, and the "Time has been changed" messages still persist. It seems to be a time discrepancy between the VM and the host system, and I am not sure how to address it. I also have VMs of the same exact version that I installed over a month ago on Azure, and they don't show this error.
Thanks for any guidance on this.
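"Time has been changed" is logged by systemd whenever the system clock is set, so something is repeatedly stepping the clock. On Azure (Hyper-V) guests, a frequent culprit is the host's time-sync integration competing with an in-guest NTP client. A sketch of a first diagnostic pass, assuming Xenial defaults:

# which time sources are active?
timedatectl status
systemctl status systemd-timesyncd
# is the Hyper-V guest services module (including host time sync) loaded?
lsmod | grep hv_utils

If both the host sync and an NTP client turn out to be adjusting the clock, disabling one of them usually stops the messages.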

GPFS: mmremote: Unable to determine the local node identity

I had a 4-node GPFS cluster up and running, and things were fine until last week, when the server hosting these RHEL setups went down. After the server was brought back up and the RHEL nodes were restarted, one node's IP address got changed.
After that, I am not able to use the node.
Simple commands like 'mmlscluster' and 'mmgetstate' fail with this error:
[root@gpfs3 ~]# mmlscluster
mmlscluster: Unable to determine the local node identity.
mmlscluster: Command failed. Examine previous error messages to determine cause.
[root@gpfs3 ~]# mmstartup
mmstartup: Unable to determine the local node identity.
mmstartup: Command failed. Examine previous error messages to determine cause.
mmshutdown fails with a different error:
[root@gpfs3 ~]# mmshutdown
mmshutdown: Unexpected error from getLocalNodeData: Unknown environmentType. Return code: 1
The logs have this info:
Mon Feb 15 18:18:34 IST 2016: Node rebooted. Starting mmautoload...
mmautoload: Unable to determine the local node identity.
Mon Feb 15 18:18:34 IST 2016 mmautoload: GPFS is waiting for daemon network
mmautoload: Unable to determine the local node identity.
Mon Feb 15 18:19:34 IST 2016 mmautoload: GPFS is waiting for daemon network
mmautoload: Unable to determine the local node identity.
Mon Feb 15 18:20:34 IST 2016 mmautoload: GPFS is waiting for daemon network
mmautoload: Unable to determine the local node identity.
Mon Feb 15 18:21:35 IST 2016 mmautoload: GPFS is waiting for daemon network
mmautoload: Unable to determine the local node identity.
Mon Feb 15 18:22:35 IST 2016 mmautoload: GPFS is waiting for daemon network
mmautoload: Unable to determine the local node identity.
mmautoload: The GPFS environment cannot be initialized.
mmautoload: Correct the problem and use mmstartup to start GPFS.
I tried changing the node's IP in the cluster configuration to the new one, but got the same error:
[root@gpfs1 ~]# mmchnode -N gpfs3 --admin-interface=xx.xx.xx.xx
Mon Feb 15 20:00:05 IST 2016: mmchnode: Processing node gpfs3
mmremote: Unable to determine the local node identity.
mmremote: Command failed. Examine previous error messages to determine cause.
mmremote: Unable to determine the local node identity.
mmremote: Command failed. Examine previous error messages to determine cause.
mmchnode: Unexpected error from checkExistingClusterNode gpfs3. Return code: 0
mmchnode: Command failed. Examine previous error messages to determine cause.
Can someone please help me in fixing this issue?
The easiest fix is probably to remove the node from the cluster (mmdelnode) and then add it back in (mmaddnode). You might need to mmdelnode -f.
If deleting and adding the node back in is not an option, try giving IBM support a call.
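A sketch of that delete/re-add sequence, run from a healthy node such as gpfs1 (assuming gpfs3 now resolves to the node's new IP address and holds no disks that block removal):

# drop the broken node from the cluster; add -f if the normal delete refuses
mmdelnode -N gpfs3
# re-add it under its current address and start GPFS on it
mmaddnode -N gpfs3
mmstartup -N gpfs3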

Apache: Failed to configure CA certificate chain

Pre-note: the certificates were purchased from a vendor and are valid until 2018.
Apache on one of our servers (Ubuntu 12.04) crashed this morning. Trying to restart Apache kept giving us the following error message:
[Wed Jun 03 12:21:51.875811 2015] [ssl:emerg] [pid 30534] AH01903: Failed to configure CA certificate chain!
[Wed Jun 03 12:21:51.875846 2015] [ssl:emerg] [pid 30534] AH02311: Fatal error initialising mod_ssl, exiting. See /var/log/apache2/error.log for more information
After removing the following line from the Apache config,
SSLCertificateChainFile /etc/apache2/ssl/wck.bundle
Apache reloaded.
The server had not been restarted, so I am sure no updates were applied by accident.
I then tried to get it up and running on one of the 14.04 Ubuntu servers we own. The same problem occurred with the same certificates. I asked the person who set up the 14.04 Apache, and he claims the problem we suddenly experienced today on the 12.04 server has always happened on the 14.04 server.
I tried reproducing the error on my local 14.04 machine by installing a fresh Apache and copying over the certificates and the config file for one of the sites. On my local machine, everything worked perfectly after the setup.
I have compared OpenSSL and library versions between the two 14.04 machines, but everything looks the same. I even upgraded both my local machine and the 14.04 server to ensure the libraries and Apache versions are identical, yet one works and the other doesn't. I reckon that if I can solve this problem for the 14.04 server, it will give me the information needed to get the SSL certificate chain working on the 12.04 machine.
Does anyone have an idea why the 12.04 Ubuntu's Apache would suddenly stop working with the SSL certificate chain, and why the 14.04 server produces the same error while my local 14.04 does not?
Any help would be appreciated.
Thanks in advance.
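AH01903 generally means mod_ssl could not read or parse the file handed to SSLCertificateChainFile, so the bundle itself is the first thing to inspect on the failing machines. A sketch for checking it by hand (server.crt is a hypothetical path; substitute the SSLCertificateFile value from the site config):

# list every certificate inside the bundle
openssl crl2pkcs7 -nocrl -certfile /etc/apache2/ssl/wck.bundle | openssl pkcs7 -print_certs -noout
# confirm the server certificate verifies against the bundle
openssl verify -CAfile /etc/apache2/ssl/wck.bundle /etc/apache2/ssl/server.crt

Stray bytes, Windows line endings, or a damaged BEGIN/END line in the bundle can make one OpenSSL build reject a file that another accepts, which would fit the machine-to-machine differences described above.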
