Nagios - unable to read output - Linux

I made a custom bash script to monitor failed SSH logins. It runs fine locally, both on the Nagios server and on the remote hosts.
root@xxx:/usr/local/nagios/libexec# ./check_bruteforce_ssh.sh -c 20 -w 50
OK - no constant bruteforce attack
But the Nagios web page shows "Unable to read output".
I made some changes to the configs, following https://support.nagios.com/kb/article/nrpe-nrpe-unable-to-read-output-620.html, to verify what's going wrong, but I cannot find where the problem is.
The script runs via NRPE, which runs on all machines.
root@test:/usr/local/nagios/libexec# ./check_nrpe -H test1
NRPE v3.2.1
When I tested the script via NRPE, I got:
NRPE: Command 'check_bruteforce_ssh' not defined
even though it is defined in nrpe.cfg:
command[check_bruteforce_attack]=/usr/local/nagios/libexec/check_bruteforce_attack.sh -w 20 -c 50
All permissions for the nagios user have been added (in sudoers, etc.).
Where can I find a solution? Has somebody run into a similar problem?

You have an error in your definition: the check is being called as check_bruteforce_ssh, but nrpe.cfg defines check_bruteforce_attack.
Rename the command in nrpe.cfg from check_bruteforce_attack to check_bruteforce_ssh and it will work ;-)
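For illustration, the corrected entry might look like this (assuming the plugin file is the check_bruteforce_ssh.sh that was tested locally, with the thresholds kept from the existing definition):
command[check_bruteforce_ssh]=/usr/local/nagios/libexec/check_bruteforce_ssh.sh -w 20 -c 50
After restarting the NRPE agent, the check can be tested end-to-end from the Nagios server:
./check_nrpe -H test1 -c check_bruteforce_ssh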

NRPE Socket timeout via NRPE, works as nrpe user

NRPE on an Azure server (nrpe-srvr), user nrpe, executing the script /usr/local/naemon/libexec/check_curl_http.php, which I'll call "script".
Desired output after ./script -U www.google.com:
Page OK: HTTP Status Code 200 - 11099 bytes in 0.** seconds | time=0.059 size=11099
I get the above output when running the script as root or as nrpe.
Running sudo -u nrpe ./script -U www.google.com returns:
Error in opening page! Err:Failed to connect to [ipv6 addr] Network is unreachable
However running su - nrpe -c './script -U www.google.com' works with the desired result.
Naemon reports:
CHECK_NRPE: Socket timeout after 30 secs
Other NRPE checks to the same host are working, so I think it's something to do with user execution of this specific script. I did have a deny from SELinux, but I adjusted the context; removing the context and setting SELinux to permissive yielded the same error. I enabled NRPE log files with debugging, but other than "Running command" they don't really reveal much. There is a:
WARNING: my_system() seteuid(0): Operation not permitted
in the logs, but looking at the support documentation that is "Normal" behavior.
I'll post this just in case someone else has this issue, and I'll tag Azure / AWS.
Essentially, most cloud providers have an internal proxy whose address is stored in the http_proxy and https_proxy environment variables. By default, NRPE doesn't load environment variables. I don't know whether there is an option for that (the docs mention a bug when using a uid instead of a username, though I was using a username), but it's simple enough to set the proxy explicitly for checks like this.
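As a sketch of that workaround, the proxy variables can be exported inline in the NRPE command definition itself; the proxy URL below is a placeholder, not a value from the question:
command[check_google]=/bin/sh -c 'export http_proxy=http://internal-proxy:3128 https_proxy=http://internal-proxy:3128; /usr/local/naemon/libexec/check_curl_http.php -U www.google.com'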

Keep SSH running on Windows 10 Bash

I am having a problem keeping SSH running on the Windows Subsystem for Linux. It seems that if a shell is not open and running bash, all processes in the subsystem are killed. Is there a way to stop this?
I have tried to create a service using nssm but have not been able to get it working. Now I am attempting to start a shell and then just send it to the background, but I haven't quite figured out how.
You have to keep at least one bash console open in order for background tasks to keep running: as soon as you close your last open bash console, WSL tears down all running processes.
And, yes, we're working on improving this scenario in the future ;)
Update 2018-02-06
In recent Windows 10 Insider builds, we added the ability to keep daemons and services running in the background, even if you close all your Linux consoles!
One remaining limitation with this scenario is that you do have to manually start your services (e.g. $ sudo service ssh start in Ubuntu), though we are investigating how we might be able to allow you to configure which daemons/services auto-start when you login to your machine. Updates to follow.
To maintain WSL processes, I place this file in C:\Users\USERNAME\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\wsl.vbs
set ws=wscript.createobject("wscript.shell")
ws.run "C:\Windows\System32\bash.exe -c 'sudo /etc/rc.local'",0
In /etc/rc.local I kick off some services and finally "sleep" to keep the whole thing running:
/usr/sbin/sshd
/usr/sbin/cron
#block on this line to keep WSL running
sleep 365d
In /etc/sudoers.d I added a 'rc-local' file to allow the above commands without a sudo password prompt:
username * = (root) NOPASSWD: /etc/rc.local
username * = (root) NOPASSWD: /usr/sbin/cron
username * = (root) NOPASSWD: /usr/sbin/sshd
This worked well on 1607 but after the update to 1704 I can no longer connect to wsl via ssh.
Once you have cron running you can use 'sudo crontab -e -u username' to define cron jobs with @reboot to launch at login.
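For example, a minimal entry to start sshd at login (using the sshd path from the rc.local above) would be:
@reboot /usr/sbin/sshd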
I just read through this thread earlier today and used it to get sshd running without having a WSL console open.
I am on Windows 10 Version 1803 and using Ubuntu 16.04.5 LTS in WSL.
I needed to make a few changes to get it working. Many thanks to google search and communities like this.
I modified /etc/rc.local as such:
mkdir /var/run/sshd
/usr/sbin/sshd
#/usr/sbin/cron
I needed to add the directory for sshd or I would get the error "Missing privilege separation directory /var/run/sshd".
I commented out cron because I was getting similar errors and haven't had the time or need yet to fix it.
I also changed the sudoers entries a little bit in order to get them to work:
username ALL = ....
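Presumably the full set mirrors the entries from the earlier answer with * replaced by ALL; this is a reconstruction, since the post elides them:
# reconstructed guess - the original post elides these lines
username ALL = (root) NOPASSWD: /etc/rc.local
username ALL = (root) NOPASSWD: /usr/sbin/cron
username ALL = (root) NOPASSWD: /usr/sbin/sshd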
Hope this is useful to someone.
John Butler

No lock file found in /usr/local/nagios/var/nagios.lock

I followed the instructions at this link, which I found here on EE: http://nagios.sourceforge.net/docs/3_0/quickstart-fedora.html
After trying to stop Nagios with the command service nagios stop, and then checking its status with service nagios status, the following message appears: "No lock file found in /usr/local/nagios/var/nagios.lock". How do I resolve it?
Thanks.
This is not a bug. "No lock file found in /usr/local/nagios/var/nagios.lock" means that it isn't running.
If you run echo $? directly after service nagios status while it isn't running, you'll notice that the exit code is 3.
3 is the correct return code for that status, as documented in the Linux Standard Base.
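You can confirm this yourself:
service nagios status
echo $?    # prints 3 while Nagios is stopped (LSB: "program is not running")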
Some Sources:
https://refspecs.linuxbase.org/LSB_3.0.0/LSB-PDA/LSB-PDA/iniscrptact.html
http://ftp.novell.hu/pub/mirrors/ftp.novell.com/forge/library/SUSE%20Package%20Conventions/spc_init_scripts.html
Just run:
/usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
UPDATE:
The error: "No lock file found in /usr/local/nagios/var/nagios.lock" simply means that nagios is not running.
Running the command above simply starts the Nagios daemon and points it at a specific config file. The advantage of running this command manually rather than through systemd is that "service nagios start" typically calls the /etc/rc.d/init.d/nagios script, which contains a line with parametrized environment variables:
$NagiosBin -d $NagiosCfgFile
Because every system is different, not specifying the binary and config paths explicitly can lead to Nagios breaking (stopping) when it tries to start using the default installation directory paths.
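For illustration, on the quickstart layout used above the init script's variables would expand to something like this (actual values differ per system):
NagiosBin=/usr/local/nagios/bin/nagios
NagiosCfgFile=/usr/local/nagios/etc/nagios.cfg
$NagiosBin -d $NagiosCfgFile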

Jenkins build - using SSH agent module - not able to run full set of commands on target machine

I'm trying to use the SSH agent module in Jenkins to log in as myself onto a remote machine and run some lxc commands.
I don't get any errors logging in, but depending on the commands that I run, I'm either getting different results or permission denied errors.
This pastebin shows the output I'm getting from Jenkins:
http://pastebin.com/VxPHYzf9
In example 1, I'm trying to run an lxc command but it fails.
In example 2, I run rc-status, expecting to see a list of running containers, but I get something entirely different.
What I've tried so far:
When I manually ssh into the lxc host as user "johndoe" I am able to run both commands no problem.
I have also "watched" a log file on the lxc host that shows the SSH fingerprint of whoever is logging in, and I can see that when Jenkins does the build, it logs in as "johndoe".
I've created a "jenkins" user / ssh key pair and also tried using that. I get the same results.
The last test I tried, which is not shown in the pastebin, is just to run "ls -lah" on the tmp folder of the lxc host, where I know I have some bash files.
When I change the build to run these commands:
whoami
ssh -AtT root@10.111.11.11
ls -lah /tmp/*.sh
I get these results, which again are completely wrong:
+ whoami
jenkins
+ ssh -AtT root@10.111.11.11
********************************* ATTENTION **********************************
* Unauthorized access prohibited. Activity is logged. *
******************************************************************************
15:13:25 up 23 days, 19:33, 0 users, load average: 0.11, 0.07, 0.11
+ ls -lah /tmp/hudson3065491328414451721.sh
-rw-r--r-- 1 jenkins jenkins 52 Jun 10 11:13 /tmp/hudson3065491328414451721.sh
[ssh-agent] Stopped.
Finished: SUCCESS
I have at least 3 sh files ... and I don't know what the hudson*.sh file is.
Ultimately, I need Jenkins to be able to log into the lxc host and run a bunch of commands as a part of my build.
Any help would be appreciated.
I'm also going to try to get the build to ssh into a different machine to see if I get different results.
I figured it out.
It looks like my mistake was putting all my shell commands for the remote server on separate lines.
So for example, in the build "Execute Shell" text box I had something like this:
ssh -A root@10.1.1.1
whoami
ls -lah /tmp/*.sh
What it was doing was logging into the remote server and then running the other two commands on my LOCAL Jenkins server.
So I changed the shell commands to look like this:
ssh -A root@10.1.1.1 'ls -lah'
and now it's working.
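To run the full set of commands from the original build step on the remote host, they can all go into the one quoted string, for example:
ssh -A root@10.1.1.1 'whoami; ls -lah /tmp/*.sh'
Everything inside the quotes executes on the remote server; anything on a following line runs back on the local Jenkins node.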

Using Psexec with Windows Server 2003

I'm trying to run a psexec command to a remote Windows Server 2003 machine. I run the following command:
psexec \machinename perfmon.msc -u machineadmin -p adminpassword -i -s
The -i and -s flags will allow me to run the GUI for perfmon.msc on the remote machine's desktop interactively.
I get the following error when I try to run the above command:
Couldn't Access machinename
Access denied
I'm using psexec version 1.94 and I'm certain that the machinename, user, and password are correct. Does anyone know if there are known issues with psexec on Windows Server 2003 and whether or not there is a fix?
[This question would be a better fit for ServerFault.com, but nevertheless...]
A few suggestions:
Use two slashes before the machine name, e.g. \\machinename (maybe that's what you meant and StackOverflow escaped the backslash)
*.MSC files are not usually directly executable remotely--you'll want to give the path to "c:\WINDOWS\system32\mmc.exe" and then the parameters
All parameters for psexec should go before the remote program and its parameters.
Is there really a reason to run the process as the System account ('-s') instead of just Administrator?
All together, it should look something like this:
psexec \\machinename -i -u machineadmin -p adminpassword "c:\WINDOWS\system32\mmc.exe" "perfmon.msc"
Are you connecting as an admin to the remote machine? The error says 'access denied'. You may not have the necessary privileges. Try connecting as an admin.
