CHECK_GEARMAN CRITICAL - function 'BulkEmail' is not registered in the server - linux

I am using Nagios to monitor Gearman and I get the error "CRITICAL - function 'xxx' is not registered in the server".
The script that Nagios executes to check Gearman is:
#!/usr/bin/env perl
# taken from: gearmand-0.24/libgearman-server/server.c:974
# function->function_name, function->job_total,
# function->job_running, function->worker_count);
#
# this code give following result with gearadmin --status
#
# FunctionName job_total job_running worker_count
# AdsUpdateCountersFunction 0 0 4
use strict;
use warnings;
use Nagios::Plugin;
my $VERSION="0.2.1";
my $np;
$np = Nagios::Plugin->new(usage => "Usage: %s -f|--flist <func1[:threshold1],..,funcN[:thresholdN]> [--host|-H <host>] [--port|-p <port>] [ -c|--critworkers=<threshold> ] [ -w|--warnworkers=<threshold>] [-?|--usage] [-V|--version] [-h|--help] [-v|--verbose] [-t|--timeout=<timeout>]",
version => $VERSION,
blurb => 'This plugin checks a gearman job server, expecting that every function in function-list arg is registered by at least one worker, and expecting that job_total is not too much high.',
license => "Brought to you AS IS, WITHOUT WARRANTY, under GPL. (C) Remi Paulmier <remi.paulmier\@gmail.com>",
shortname => "CHECK_GEARMAN",
);
$np->add_arg(spec => 'flist|f=s',
help => q(Check for the functions listed in STRING, separated by comma. If optional threshold is given (separated by :), check that waiting jobs for this particular function are not exceeding that value),
required => 1,
);
$np->add_arg(spec => 'host|H=s',
help => q(Check the host indicated in STRING),
required => 0,
default => 'localhost',
);
$np->add_arg(spec => 'port|p=i',
help => q(Use the TCP port indicated in INTEGER),
required => 0,
default => 4730,
);
$np->add_arg(spec => 'critworkers|c=i',
help => q(Exit with CRITICAL status if fewer than INTEGER workers have registered a particular function),
required => 0,
default => 1,
);
$np->add_arg(spec => 'warnworkers|w=i',
help => q(Exit with WARNING status if fewer than INTEGER workers have registered a particular function),
required => 0,
default => 4,
);
$np->getopts;
my $ng = $np->opts;
# manage timeout
alarm $ng->timeout;
my $runtime = {'status' => OK,
'message' => "Everything OK",
};
# host & port
my $host = $ng->get('host');
my $port = $ng->get('port');
# verbosity
my $verbose = $ng->get('verbose');
# look for gearadmin, use nc if not found
my @paths = grep { -x "$_/gearadmin" } split /:/, $ENV{PATH};
my $cmd = "gearadmin --status -h $host -p $port";
if (@paths == 0) {
print STDERR "gearadmin not found, using nc\n" if ($verbose != 0);
# $cmd = "echo status | nc -w 1 $host $port";
$cmd = "echo status | nc -i 1 -w 1 $host $port";
}
foreach (`$cmd 2>/dev/null | grep -v '^\\.'`) {
chomp;
my ($fname, $job_total, $job_running, $worker_count) =
split /[[:space:]]+/;
$runtime->{'funcs'}{"$fname"} = {job_total => $job_total,
job_running => $job_running,
worker_count => $worker_count };
# print "$fname : $runtime->{'funcs'}{\"$fname\"}{'worker_count'}\n";
}
# get function list
my @flist = split /,/, $ng->get('flist');
foreach (@flist) {
my ($fname, $fthreshold);
if (/\:/) {
($fname, $fthreshold) = split /:/;
} else {
($fname, $fthreshold) = ($_, -1);
}
# print "defined for $fname: $runtime->{'funcs'}{\"$fname\"}{'worker_count'}\n";
# if (defined($runtime->{'funcs'}{"$fname"})) {
# print "$fname is defined\n";
# } else {
# print "$fname is NOT defined\n";
# }
if (!defined($runtime->{'funcs'}{"$fname"}) &&
$runtime->{'status'} <= CRITICAL) {
($runtime->{'status'}, $runtime->{'message'}) =
(CRITICAL, "function '$fname' is not registered in the server");
} else {
if ($runtime->{'funcs'}{"$fname"}{'worker_count'} <
$ng->get('critworkers') && $runtime->{'status'} <= CRITICAL) {
($runtime->{'status'}, $runtime->{'message'}) =
(CRITICAL,
"less than " .$ng->get('critworkers').
" workers were found having function '$fname' registered.");
}
if ($runtime->{'funcs'}{"$fname"}{'worker_count'} <
$ng->get('warnworkers') && $runtime->{'status'} <= WARNING) {
($runtime->{'status'}, $runtime->{'message'}) =
(WARNING,
"less than " .$ng->get('warnworkers').
" workers were found having function '$fname' registered.");
}
if ($runtime->{'funcs'}{"$fname"}{'job_total'} > $fthreshold
&& $fthreshold != -1 && $runtime->{'status'}<=WARNING) {
($runtime->{'status'}, $runtime->{'message'}) =
(WARNING,
$runtime->{'funcs'}{"$fname"}{'job_total'}.
" jobs for $fname exceeds threshold $fthreshold");
}
}
}
$np->nagios_exit($runtime->{'status'}, $runtime->{'message'});
When I run the script directly from the command line, it reports "Everything OK".
But in Nagios it shows the error "CRITICAL - function 'xxx' is not registered in the server".
Thanks in advance.

After spending a long time on this, I finally found the answer. All I had to do was:
yum install nc
nc was what was missing from the system.
With Regards,
Bankat Vikhe
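For anyone hitting the same thing: the plugin only looks for gearadmin on the PATH and otherwise falls back to nc, and it discards stderr, so if neither tool is installed the status listing comes back empty and every function looks unregistered. Below is a minimal sketch of a pre-flight check that mirrors that lookup. The function name pick_status_client is made up for illustration and is not part of the plugin.

```shell
#!/bin/sh
# pick_status_client: print the tool the plugin would end up using.
# Mirrors the plugin's logic: prefer gearadmin, fall back to nc.
pick_status_client() {
    if command -v gearadmin >/dev/null 2>&1; then
        echo gearadmin
    elif command -v nc >/dev/null 2>&1; then
        echo nc
    else
        echo none
        return 1
    fi
}

if client=$(pick_status_client); then
    echo "status client: $client"
else
    echo "neither gearadmin nor nc found; install one (e.g. yum install nc)" >&2
fi
```

Running this as the same user, and with the same PATH, that Nagios uses should show whether the check has any way of talking to gearmand at all.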

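It is hard to say for certain, but it could be that your script does not run correctly under Nagios's embedded Perl interpreter (ePN).
Try adding the line # nagios: -epn near the top of the script, so that Nagios runs it with the external Perl interpreter instead: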
#!/usr/bin/env perl
# nagios: -epn
use strict;
use warnings;
Be sure to check all the hints in the Perl Plugins section of the Nagios Plugin Development Guidelines.

Related

Can I avoid this subshell in a POSIX sh script?

I am trying to understand whether, and how, I can avoid the subshell in the code below.
Is this the only way the code can be written, or is there another way?
I tried to use braces { ... }, but the result doesn't pass shellcheck and doesn't run.
is_running_interactively ()
# test if file descriptor 0 = standard input is connected to the terminal
{
[ -t 0 ]
}
is_tput_available ()
# check if tput coloring is available
{
command -v tput > /dev/null 2>&1 &&
tput bold > /dev/null 2>&1 &&
tput setaf 1 > /dev/null 2>&1
}
some_other_function ()
# so far unfinished function
{
# is this a subshell? if so, can I avoid it somehow?
( is_running_interactively && is_tput_available ) || # <-- HERE
{
printf '%b' "${2}"
return
}
...
}
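It is a compound-list, and yes, those commands run in a subshell. To avoid it, use curly braces instead of parentheses. Note the space after { and the ; before }: both are required, because { and } are reserved words rather than operators: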
{ is_running_interactively && is_tput_available; } || ...
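The difference is easy to observe with a variable assignment. This is a small illustrative sketch; the variable names are made up for the demonstration:

```shell
#!/bin/sh
count=0

( count=1 )            # parentheses: runs in a subshell, the assignment is lost
after_parens=$count

{ count=2; }           # braces: runs in the current shell, the assignment persists
after_braces=$count

printf 'after parens: %s, after braces: %s\n' "$after_parens" "$after_braces"
# prints: after parens: 0, after braces: 2
```

The same applies to return: inside ( ... ) it only leaves the subshell, while inside { ...; } it returns from the enclosing function.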

Snoopy module in Puppet returns "Could not find declared class snoopy::install"

I am getting an error on the snoopy module. When I run it on my client, I am getting this error:
"Error: Could not retrieve catalog from remote server: Error 400 on SERVER:
Puppet::Parser::AST::Resource failed with error ArgumentError: Could not find declared class snoopy::install at /etc/puppet/modules/snoopy/manifests/init.pp:22 on node <hostname>
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run"
Any ideas what I am doing wrong here? This is the snoopy module from git with a couple of modifications for our environment. https://forge.puppet.com/revolutionsystem/snoopy
install.pp
$ cat install.pp
# snoopy::install
#
# A description of what this class does
#
# @summary A short summary of the purpose of this class
#
# @example
# include snoopy::install
class snoopy::install {
# Download snoopy installation script
file { '/tmp/snoopy':
ensure => directory,
owner => 'root',
group => 'root',
mode => '0755',
} ->
exec { 'wget installer':
command => "/usr/bin/wget http://159.79.213.28/pub/PUPPET/snoopy- install.sh",
creates => "/tmp/snoopy/snoopy-install.sh",
require => [ File['/tmp/snoopy'], ];
} ->
# Install Snoopy stable version
exec { '/tmp/snoopy/snoopy-install.sh stable':
cwd => '/tmp/snoopy',
command => '',
path => [ '/bin/bash' ],
unless => [ 'test -f /tmp/snoopy/snoopy-install.sh']
require => [ File['/tmp/snoopy'], File['/tmp/snoopy/snoopy- install.sh'], ];
}
}'
$ cat init.pp
# snoopy
#
# A description of what this class does
#
# @summary A short summary of the purpose of this class
#
# @example
# include snoopy
class snoopy (
$user_name = $::snoopy::params::username,
$user_id = $::snoopy::params::userid,
$group_id = $::snoopy::params::groupid,
$super_id = $::snoopy::params::superid,
$terminal = $::snoopy::params::terminal,
$current_directory = $::snoopy::params::currentdirectory,
$process_id = $::snoopy::params::processid,
$file_name = $::snoopy::params::filename,
$log_file = $::snoopy::params::logfile,
$log_path = $::snoopy::params::logpath,
$date_time = $::snoopy::params::datetime
) inherits snoopy::params {
class { 'snoopy::install': }
class { 'snoopy::configure':
username => $user_name,
userid => $user_id,
groupid => $group_id,
superid => $super_id,
terminal => $terminal,
currentdirectory => $current_directory,
processid => $process_id,
filename => $file_name,
logfile => $log_file,
logpath => $log_path,
datetime => $date_time
}
}
This can be closed. The problem is the version of the Puppet agent we are running: the module requires version >= 4.0.0 and <= 6.0.0.
We are running 3.8.x.
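One way to catch this kind of mismatch before applying a module is to compare the agent version with the range the module declares. The sketch below is plain POSIX sh. version_le and puppet_in_range are hypothetical helper names, and the bounds are simply the ones quoted above.

```shell
#!/bin/sh
# version_le A B: succeed if dotted numeric version A <= B (e.g. 3.8.7 <= 4.0.0).
# sort -V is not POSIX, so compare one component at a time.
version_le() {
    a=$1 b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        x=${a%%.*}; y=${b%%.*}                  # leading components
        [ "$a" = "$x" ] && a= || a=${a#*.}      # drop them from the remainder
        [ "$b" = "$y" ] && b= || b=${b#*.}
        x=${x:-0}; y=${y:-0}                    # missing components count as 0
        [ "$x" -lt "$y" ] && return 0
        [ "$x" -gt "$y" ] && return 1
    done
    return 0                                    # all components equal
}

# puppet_in_range VERSION: succeed if 4.0.0 <= VERSION <= 6.0.0.
puppet_in_range() {
    version_le 4.0.0 "$1" && version_le "$1" 6.0.0
}
```

It could be called as puppet_in_range "$(puppet --version)". It only handles purely numeric components, so a literal "3.8.x" would have to be replaced with the real version string.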

Expect ignoring pattern matching and not exiting

I'm new to expect and this is puzzling me big time. It works perfectly with one pattern, but when the second case comes up it ignores the exit completely. First, this is my code.
#!/usr/bin/expect
#Usage migration_test.xpct <ssh_password> <vmname> <no_migraciones>
set timest [ timestamp -format %Y-%m-%d_%H-%M ]
set vmname [lindex $argv 1]
log_file migtest_${vmname}_${timest}.log ;
set password [lindex $argv 0]
set num [lindex $argv 2]
set failureMsg "Status: Failure\n\r"
set timeout 60
spawn ssh admin@localhost -p 10000
expect "yes/no" {
send "yes\r"
expect "*?assword" { send "$password\r" }
} "*?assword" { send "$password\r" }
for {set i 0} {$i < $num} {incr i 1} {
expect "OVM> " {
send "show Vm name=$vmname\r"
expect {
$failureMsg { }
-re "Status = Running\n\r" {
exp_continue
}
-re "Server = .*? \\\[(.*?)(1|2)?\\\]\n\r" {
set destserver $expect_out(2,string);
if { $destserver == 1 } {
send_user "\n\nMIGRATION [ expr $i+1 ] of $num\n\n"
send "migrate Vm name=$vmname destServer=serv_prod02\r"
expect {
-re "JobId: (.*?)\n\r" {
set jobid $expect_out(1,string);
send "show Job id=$jobid\r";
expect {
-re "Command:(.*?)\n\r" { send_user "\n\nWaiting 30secs before next migration\n\n";
sleep 30; }
}
}
-re "Status: Failure\n\r" { send_user "\n\nExiting\n"; exit 1 }
}
} else {
send_user "\n\nMIGRATION [expr $i+1] of $num\n\n"
send "migrate Vm name=$vmname destServer=serv_prod01\r"
expect {
-re "JobId: (.*?)\n\r" {
set jobid $expect_out(1,string);
send "show Job id=$jobid\r";
expect {
-re "Command:(.*?)\n\r" { send_user "\n\nWaiting 30secs before next migration\n\n";
sleep 30; }
}
}
-re "Status: Failure\n\r" { send_user "\n\nExiting\n"; exit 1 }
}
}
}
}
}
}
send "exit\r"
expect eof
The problem comes when it reaches the "migrate Vm" section. That is a job I'm sending to a CLI (the Oracle OVM CLI, to be precise), and the job can either fail or succeed. I want to print the job details when it succeeds but end the whole run if the job fails (since the output already shows the reason and I don't need to expand the job details).
Here is how the output of a successful job looks:
MIGRATION 5 of 12
migrate Vm name=slestest_temp_share_vm destServer=serv_prod01
Command: migrate Vm name=slestest_temp_share_vm
destServer=serv_prod01
Status: Success
Time: 2016-04-13 10:45:24,174
JobId: 12345678978
OVM> show Job id=12345678978
Command: show Job id=12345678978
Status: Success Time: 2016-04-13 10:45:24,188
Data:
Run State = Success
Summary State = Success
Done = Yes
Summary Done = Yes
Job Group = No
Username = admin
Creation Time = Apr 13, 2016 10:44:45 am
Start Time = Apr 13, 201 10:44:45 am
End Time = Apr 13, 2016 10:45:23 am
Duration = 37s
Id = 12345678978 [Migrate Vm: slestest_temp_share_vm to Server: serv_prod01]
Name = Migrate Vm: slestest_temp_share_vm to Server:serv_prod01
Description = Migrate Vm: slestest_temp_share_vm to
Server: serv_prod01 Locked = false
OVM>
Waiting 30secs before next migration
And here is what a failed job looks like:
MIGRATION 4 of 12
migrate Vm name=slestest_temp_share_vm destServer=serv_prod01
Command: migrate Vm name=slestest_temp_share_vm destServer=serv_prod01
Status: Failure
Time: 2016-04-13 11:31:08,819
JobId: 1460564963372
Error Msg: Job failed on Core: OVMAPI_5001E Job: 1460564963372/Migrate Vm: slestest_temp_share_vm to Server: serv_prod01/Migrate Vm: slestest_temp_share_vm serv_prod01, failed. Job Failure Event: 1460565064570/Server Async Command Failed/OVMEVT_00C014D_001 Async command failed serv_prod02. Object: slestest_temp_share_vm, PID: 1724,
Server error: Command: ['xm', 'migrate', '--live', '0004fb00000600009f354416bab38df6', '8.8.8.1'] failed (1): stderr: Error: ti
stdout: Usage: xm migrate
Migrate a domain to another machine.
Options:
-h, --help Print this help.
-l, --live Use live migration.
-p=portnum, --port=portnum
Use specified port for migration.
-n=nodenum, --node=nodenum
Use specified NUMA node on target.
-s, --ssl Use ssl connection for migration.
-c, --change_home_server
Change home server for managed domains.
, on server: serv_prod02, associated with object: 0004fb00000600009f354416bab38df6 [Wed Apr 13 11:31:04 2016]
Why is Status: Failure ignored? Also, when that happens it seems to skip an iteration of the loop: if it was on the 5th, it then shows "Migration 7 of 12", for example.
Thanks, everyone.
I can suggest two things. First, you can rewrite the code to avoid the duplication. Second, I think the problem is that you are matching \n\r at the end of each pattern. Try \n alone, or use \n?\r?, which matches zero, one, or both line endings.
-re "Server = .*? \\\[(.*?)(1|2)?\\\]\n" {
set destserver $expect_out(2,string);
send_user "\n\nMIGRATION [ expr $i+1 ] of $num\n\n"
if { $destserver == 1 } {
send "migrate Vm name=$vmname destServer=serv_prod02\r"
} else {
send "migrate Vm name=$vmname destServer=serv_prod01\r"
}
expect {
-re "JobId: (.*?)\n" {
set jobid $expect_out(1,string);
send "show Job id=$jobid\r";
expect {
-re "Command:(.*?)$" {
send_user "\n\nWaiting 30secs before next migration\n\n";
sleep 30;
}
}
}
-re "Status: Failure\n" { send_user "\n\nExiting\n"; exit 1 }
}
}
Well, after some tests I found the problem. It seems I didn't understand how the timeout works in expect. Every time a migration failed, it took longer than the timeout.
This wasn't evident to me because, although the timeout had expired, the script kept going and still printed the answer when it arrived; by then, none of the patterns I was expecting were being checked any more.
The solution was either to use the "timeout" pattern in the expect blocks or to set the timeout higher. I did the latter and everything is running fine now.

Puppet test and remove an array of files/folders

I'm trying to make the following code work. It seems that if I do not test for the files/folders first, I end up with this error:
Error: Failed to apply catalog: Parameter path failed on
File[/opt/dynatrace-6.2]: File paths must be fully qualified, not
'["/opt/dynatrace-6.2", "/opt/dynatrace-5.6.0",
"/opt/rh/httpd24/root/etc/httpd/conf.d/dtload.conf",
"/opt/rh/httpd24/root/etc/httpd/conf.d/01_dtagent.conf"]' at
newrelic.pp:35
The pertinent parts
$dtdeps = [
"/opt/dynatrace-6.2",
"/opt/dynatrace-5.6.0",
"${httpd_root}/conf.d/dtload.conf",
"${httpd_root}/conf.d/01_dtagent.conf",
]
exec { "check_presence":
require => File[$dtdeps],
command => '/bin/true',
onlyif => "/usr/bin/test -e $dtdeps",
}
file { $dtdeps:
require => Exec["check_presence"],
path => $dtdeps,
ensure => absent,
recurse => true,
purge => true,
force => true,
} ## this is line 35 btw
exec { "stop_dt_agent":
command => "PID=$(ps ax |grep dtwsagent |grep -v grep |awk '{print$1}') ; [ ! -z $PID ] && kill -9 $PID",
provider => shell,
}
service { "httpd_restart" :
ensure => running,
enable => true,
restart => "/usr/sbin/apachectl configtest && /etc/init.d/httpd reload",
subscribe => Package["httpd"],
}
Your code looks basically correct, but you went overboard with your file resources:
file { $dtdeps:
require => Exec["check_presence"],
path => $dtdeps,
...
This does create all the file resources from your array (since you use an array for the resource title) but each single one of them will then try to use the same array as the path value, which does not make sense.
TL;DR remove the path parameter and it should Just Work.
You can actually simplify this a lot. With ensure => absent, Puppet only removes the files if they exist, so the check_presence exec is not required.
You can't give path an array, but you can pass an array as the resource title. Puppet then creates one file resource per element, and each resource's path defaults to its title.
$dtdeps = [
"/opt/dynatrace-6.2",
"/opt/dynatrace-5.6.0",
"${httpd_root}/conf.d/dtload.conf",
"${httpd_root}/conf.d/01_dtagent.conf",
]
file { $dtdeps:
ensure => absent,
recurse => true,
purge => true,
force => true,
}
exec { "stop_dt_agent":
command => "PID=\$(ps ax | grep dtwsagent | grep -v grep | awk '{print \$1}'); [ -n \"\$PID\" ] && kill -9 \$PID",
provider => shell,
}
However, running the stop_dt_agent exec is a bit fragile. You could probably refactor this into a service resource instead:
service { 'dynatrace':
ensure => stopped,
provider => 'base',
stop => "kill -TERM \$(ps ax | grep dtwsagent | grep -v grep | awk '{print \$1}')",
status => 'ps ax | grep dtwsagent | grep -v grep',
}
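The PID lookup that both the exec and the service's stop command rely on can be tried out on its own. The following is a rough sketch in plain sh, not part of either answer. pids_matching and stop_agent are hypothetical names, and the only thing assumed about the agent is that "dtwsagent" appears in its command line.

```shell
#!/bin/sh
# pids_matching PATTERN: read "ps ax"-style lines on stdin and print the PID
# (first column) of every line matching PATTERN, skipping grep processes.
pids_matching() {
    awk -v pat="$1" '$0 ~ pat && $0 !~ /grep/ { print $1 }'
}

# stop_agent: send TERM to the agent only if it is running (defined, not called).
stop_agent() {
    snapshot=$(ps ax)     # snapshot first, so the awk process cannot match itself
    pids=$(printf '%s\n' "$snapshot" | pids_matching dtwsagent)
    [ -n "$pids" ] || return 0    # nothing to stop is not an error
    kill -TERM $pids              # unquoted on purpose: one argument per PID
}
```

Feeding pids_matching a few canned lines (for example with printf) is an easy way to check the pattern before letting Puppet run it. Returning success when nothing matches also avoids a failed resource on nodes where the agent is already gone.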

puppet service wont start

I have the following code for a small Tomcat class. The class applies fine, but the service at the end of the manifest doesn't start. Any guidance would be great; I don't know why the service won't start.
Content of the tomcat6 class:
# Class: tomcat6
#
# This module manages tomcat6
#
# Parameters: none
#
# Actions:
#
# Requires: see Modulefile
#
# Sample Usage:
#
class tomcat6 ( $parentdir = '/usr/share',
$tomcat_version = '6.0.37',
$tomcat_major_version = '6',
$digest_string = '171d255cd60894b29a41684ce0ff93a8',
$tomcat_exe = 'tomcat6/tomcat.erb',
$java_home = '/usr/java/latest',
$jvm_route = 'jvm1',
$shutdown_password = 'SHUTDOWN',
$admin_port = 8005,
$http_port = 8080,
$tomcat_user = 'root',
$tomcat_group = 'root',
$admin_user = 'tomcat',
$admin_password = 'tomcat'
) {
$basedir = "${parentdir}/apache-tomcat-6.0.37"
file {'/installs':
ensure => 'directory',
source => 'puppet:///modules/tomcat6/',
recurse => 'remote',
owner => 'root',
group => 'root',
mode => '0755',
}
exec { 'tomcat_untar':
command => 'tar -zxvf /installs/apache-tomcat-6.0.37.tar.gz -C /usr/share/',
cwd => '/usr/share/',
creates => "/usr/share/apache-tomcat-6.0.37",
path => ["/bin"],
require => [File["/installs"]]
}
file { "/etc/init.d/tomcat":
ensure => present,
owner => root,
group => root,
mode => 0755,
content => template($tomcat_exe),
require => Exec["tomcat_untar"]
}
service { "tomcat":
ensure => "running",
enable => "true",
require => File["/etc/init.d/tomcat"]
}
}
contents of tomcat.erb
#!/bin/bash
# description: Tomcat Start Stop Restart
# processname: tomcat
# chkconfig: 234 20 80
JAVA_HOME=/usr/java/jdk1.6.0_26
export JAVA_HOME
PATH=$JAVA_HOME/bin:$PATH
export PATH
CATALINA_HOME=/usr/share/apache-tomcat-6.0.37
case $1 in
start)
sh $CATALINA_HOME/bin/startup.sh
;;
stop)
sh $CATALINA_HOME/bin/shutdown.sh
;;
restart)
sh $CATALINA_HOME/bin/shutdown.sh
sh $CATALINA_HOME/bin/startup.sh
;;
esac
exit 0
Puppet runs /etc/init.d/tomcat status to see whether the service is running. If that command does not return a non-zero exit status when the service is stopped, Puppet assumes the service is already running and will not try to start it. Your init script has no status case and always ends with exit 0, so that is what happens here.
http://docs.puppetlabs.com/references/latest/type.html#service
Check out hasstatus.
An alternative is to set 'hasstatus' to false and give Puppet a 'pattern' to look for in the process table.
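If you would rather fix the init script, a status case along these lines might do. This is a minimal sketch, assuming Tomcat was started through startup.sh so that the Java command line contains -Dcatalina.home=...; tomcat_running is a made-up helper name, and exit code 3 is the LSB convention for "not running".

```shell
#!/bin/sh
CATALINA_HOME=/usr/share/apache-tomcat-6.0.37

# tomcat_running: succeed only if the process listing on stdin contains a
# process started with this CATALINA_HOME (grep processes are ignored).
tomcat_running() {
    grep -v grep | grep -q "catalina.home=$CATALINA_HOME"
}

case "${1:-}" in
status)
    if ps ax | tomcat_running; then
        echo "tomcat is running"
        exit 0
    else
        echo "tomcat is stopped"
        exit 3    # non-zero tells Puppet the service needs starting
    fi
    ;;
esac
```

In the real script this status branch would sit next to start, stop and restart, and the unconditional exit 0 at the end would then only be reached by those other actions.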
