Icinga2: monitor that a Linux service is running

I have a server running Plex and two other services that I want to monitor with Icinga2, and for the life of me I can't figure out how to get that to work. I can run the following command:
./check_procs -c 1:1 -a '/usr/lib/plexmediaserver/Plex Media Server'
which returns the following when I manually kill Plex:
PROCS CRITICAL: 0 processes with args '/usr/lib/plexmediaserver/Plex Media Server' | procs=0;;1:1;0;
I just can't figure out how to add this check to the server. Where do I put it?
I tried adding another declaration to /etc/icinga2/conf.d/services.conf as follows:
apply Service "procs"
{
import "generic-service"
check_command = "procs"
assign where host.name == NodeName
arguments =
{
"-a" =
{
value = "/usr/lib/plexmediaserver/Plex Media Server"
description = "service name"
required = true
}
}
}
But then the agent wouldn't start at all.
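As far as I can tell, arguments = { ... } is only valid inside a CheckCommand object, which would explain why the daemon refuses to start; a Service is supposed to pass values to its command through custom variables instead. A rough, untested sketch using the procs command from the Icinga Template Library (variable names per the ITL documentation):

apply Service "plex-procs" {
  import "generic-service"
  check_command = "procs"
  // Custom variables read by the ITL "procs" CheckCommand
  vars.procs_critical = "1:1"                                        // equivalent of -c 1:1
  vars.procs_argument = "/usr/lib/plexmediaserver/Plex Media Server" // equivalent of -a '...'
  assign where host.name == NodeName
}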

I solved this by defining a service:
apply Service for (service => config in host.vars.processes_linux) {
  import "generic-service"
  check_command = "nrpe"
  display_name = config.display_name
  vars.nrpe_command = "check_process"
  vars.nrpe_arguments = [ config.process, config.warn_range, config.crit_range ]
}
In the host definition I then just add a config, let's say for mongodb:
vars.processes_linux["trench-srv-lin-process-mongodb"] = {
  display_name = "MongoDB processes"
  process = "mongod"
  warn_range = "1:"
  crit_range = "1:"
}
On the remote host I need to install the package nagios-nrpe-server, and in the config file /etc/nagios/nrpe_local.cfg I add this line (the command name has to match vars.nrpe_command above, and the arguments arrive in the order given in vars.nrpe_arguments):
command[check_process]=/usr/lib/nagios/plugins/check_procs -C $ARG1$ -w $ARG2$ -c $ARG3$
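To verify the NRPE side before wiring it into Icinga2, you can call the command by hand from the monitoring host. A rough sketch, assuming the usual check_nrpe plugin path and a placeholder host name (note that NRPE only accepts arguments when dont_blame_nrpe=1 is set in nrpe.cfg):

# Runs the remote check the same way the Icinga2 "nrpe" command would
# (remote-host.example.com is a placeholder for the monitored host)
/usr/lib/nagios/plugins/check_nrpe -H remote-host.example.com -c check_process -a mongod 1: 1: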

I am running a small cluster of Raspberry Pis which I am monitoring with Icinga2. On the master node of my cluster I have a DHCP server running, and I check its status the following way.
First I downloaded the check service status plugin from the Icinga Exchange, made it executable and moved it to /usr/lib/nagios/plugins (your path may differ).
Then I defined a check command for it:
object CheckCommand "Check Service" {
  import "plugin-check-command"
  command = [ PluginDir + "/check_service.sh" ]
  arguments += {
    "-o" = {
      required = true
      value = "$check_service_os$"
    }
    "-s" = {
      required = true
      value = "$check_service_name$"
    }
  }
}
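Before attaching it to a Service, the plugin can be tested by hand. A quick sketch, assuming the path and the -o/-s flags shown in the CheckCommand above:

# Should report the unit as running and exit 0 on the DHCP master node
/usr/lib/nagios/plugins/check_service.sh -o linux -s isc-dhcp-server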
Now all that was left was defining a Service:
object Service "Check DHCP" {
host_name = "Localhost"
check_command = "Check Service"
enable_perfdata = true
event_command = "Restart DHCP"
vars.check_service_name = "isc-dhcp-server"
vars.check_service_os = "linux"
}
As a bonus, you can even define an event command that restarts your service:
object EventCommand "Restart DHCP" {
  import "plugin-event-command"
  command = [ "/usr/bin/sudo", "systemctl", "restart" ]
  arguments += {
    "(no key)" = {
      skip_key = true
      value = "$check_service_name$"
    }
  }
  vars.check_service_name = "isc-dhcp-server"
}
But for this to work, you have to give your nagios user (or whatever user runs your icinga service) sudo privileges to restart services. Add this line to your sudoers file:
nagios ALL = (ALL) NOPASSWD: /bin/systemctl restart *
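You can confirm the rule took effect before relying on the event command. A small sketch, assuming the service runs as the nagios user as above:

# List the commands the nagios user may run without a password (run as root)
sudo -l -U nagios
# Trigger the restart as that user; -n makes sudo fail instead of prompting for a password
sudo -u nagios sudo -n /bin/systemctl restart isc-dhcp-server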
I hope this helps you with your problem :-)
Jan

Related

In Terraform how to use a condition to only run on certain nodes?

Terraform v1.2.8
I have a generic script that executes a passed-in shell script on my remote AWS EC2 instance, which I've also created in Terraform.
resource "null_resource" "generic_script" {
connection {
type = "ssh"
user = "ubuntu"
private_key = file(var.ssh_key_file)
host = var.ec2_pub_ip
}
provisioner "file" {
source = "../modules/k8s_installer/${var.shell_script}"
destination = "/tmp/${var.shell_script}"
}
provisioner "remote-exec" {
inline = [
"sudo chmod u+x /tmp/${var.shell_script}",
"sudo /tmp/${var.shell_script}"
]
}
}
Now I want to be able to modify it so it runs on:
- all nodes
- this node but not that node
- that node but not this node
So I created variables in the variables.tf file
variable "run_on_THIS_node" {
type = boolean
description = "Run script on THIS node"
default = false
}
variable "run_on_THAT_node" {
type = boolean
description = "Run script on THAT node"
default = false
}
How can I put a condition to achieve what I want to do?
resource "null_resource" "generic_script" {
count = ???
...
}
You could use the ternary operator for this. For example, based on the defined variables, the condition would look like:
resource "null_resource" "generic_script" {
count = (var.run_on_THIS_node || var.run_on_THAT_node) ? 1 : length(var.all_nodes) # or var.number_of_nodes
...
}
The missing piece of the puzzle is a variable (or a plain number) that tells Terraform how many nodes "all nodes" means. It does not have to use the length function; you could define it as a number. However, this is only part of the code you would have to add or edit, because there also has to be a way to pick the host based on the index, which means you would probably have to turn var.ec2_pub_ip into a list. A sketch of how that could look is below.
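Here is a minimal, untested sketch of that idea, assuming a hypothetical var.node_ips list and keeping the provisioners from the question:

variable "node_ips" {
  type        = list(string)
  description = "Public IPs of every node the script could target (hypothetical)"
}

resource "null_resource" "generic_script" {
  # One instance when a single node is targeted, otherwise one per node
  count = (var.run_on_THIS_node || var.run_on_THAT_node) ? 1 : length(var.node_ips)

  connection {
    type        = "ssh"
    user        = "ubuntu"
    private_key = file(var.ssh_key_file)
    # Indices 0/1 for THIS/THAT node are placeholders; adjust to your inventory
    host = var.run_on_THIS_node ? var.node_ips[0] : (
      var.run_on_THAT_node ? var.node_ips[1] : var.node_ips[count.index]
    )
  }

  provisioner "file" {
    source      = "../modules/k8s_installer/${var.shell_script}"
    destination = "/tmp/${var.shell_script}"
  }

  provisioner "remote-exec" {
    inline = [
      "sudo chmod u+x /tmp/${var.shell_script}",
      "sudo /tmp/${var.shell_script}",
    ]
  }
}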

Terraform: provisioning a cloudflared tunnel locally

I tried to use Terraform without any cloud instance, only to install a cloudflared tunnel locally, using this construction:
resource "null_resource" "tunell_install" {
triggers = {
always_run = timestamp()
}
provisioner "local-exec" {
command = "/home/uzer/script/tunnel.sh"
}
}
instead of something like:

provider "google" {
  project = var.gcp_project_id
}

But after running

$ terraform apply -auto-approve

it successfully created /etc/cloudflared/cert.json with this content:
{
  "AccountTag"   : "${account}",
  "TunnelID"     : "${tunnel_id}",
  "TunnelName"   : "${tunnel_name}",
  "TunnelSecret" : "${secret}"
}
But as I understand it, there should be actual values here instead of the variable placeholders? It seems that metadata_startup_script from instance.tf is only applied to the Google instance. How can I change this so that Terraform installs and runs the Cloudflare tunnel locally? Maybe I also need to use templatefile, but in another .tf file? The current metadata_startup_script block:
// This is where we configure the server (aka instance). Variables like web_zone take a Terraform variable and provide it to the server so that it can use them as a local variable
metadata_startup_script = templatefile("./server.tpl",
  {
    web_zone    = var.cloudflare_zone,
    account     = var.cloudflare_account_id,
    tunnel_id   = cloudflare_argo_tunnel.auto_tunnel.id,
    tunnel_name = cloudflare_argo_tunnel.auto_tunnel.name,
    secret      = random_id.tunnel_secret.b64_std
  })
Content of server.tpl file:
# Script to install Cloudflare Tunnel
# cloudflared configuration
cd
# The package for this OS is retrieved
wget https://bin.equinox.io/c/VdrWdbjqyF/cloudflared-stable-linux-amd64.deb
sudo dpkg -i cloudflared-stable-linux-amd64.deb
# A local user directory is first created before we can install the tunnel as a system service
mkdir ~/.cloudflared
touch ~/.cloudflared/cert.json
touch ~/.cloudflared/config.yml
# Another heredoc is used to dynamically populate the JSON credentials file
cat > ~/.cloudflared/cert.json << "EOF"
{
  "AccountTag"   : "${account}",
  "TunnelID"     : "${tunnel_id}",
  "TunnelName"   : "${tunnel_name}",
  "TunnelSecret" : "${secret}"
}
EOF
# Same concept with the Ingress Rules the tunnel will use
cat > ~/.cloudflared/config.yml << "EOF"
tunnel: ${tunnel_id}
credentials-file: /etc/cloudflared/cert.json
logfile: /var/log/cloudflared.log
loglevel: info
ingress:
  - hostname: ssh.${web_zone}
    service: ssh://localhost:22
  - hostname: "*"
    service: hello-world
EOF
# Now we install the tunnel as a systemd service
sudo cloudflared service install
# The credentials file does not get copied over so we'll do that manually
sudo cp -via ~/.cloudflared/cert.json /etc/cloudflared/
# Now we can start the tunnel
sudo service cloudflared start
In argo.tf this code exists:
data "template_file" "init" {
template = file("server.tpl")
vars = {
web_zone = var.cloudflare_zone,
account = var.cloudflare_account_id,
tunnel_id = cloudflare_argo_tunnel.auto_tunnel.id,
tunnel_name = cloudflare_argo_tunnel.auto_tunnel.name,
secret = random_id.tunnel_secret.b64_std
}
}
If you are asking about how to create the file locally and populate the values, here is an example:
resource "local_file" "cloudflare_tunnel_script" {
content = templatefile("${path.module}/server.tpl",
{
web_zone = "webzone"
account = "account"
tunnel_id = "id"
tunnel_name = "name"
secret = "secret"
}
)
filename = "${path.module}/server.sh"
}
For this to work, you would have to assign real values to all the template variables listed above. From what I see, there are already examples of how to use variables for those values. In other words, instead of hardcoding the values for the template variables, you could use standard variables:
resource "local_file" "cloudflare_tunnel_script" {
content = templatefile("${path.module}/server.tpl",
{
web_zone = var.cloudflare_zone
account = var.cloudflare_account_id
tunnel_id = cloudflare_argo_tunnel.auto_tunnel.id
tunnel_name = cloudflare_argo_tunnel.auto_tunnel.name
secret = random_id.tunnel_secret.b64_std
}
)
filename = "${path.module}/server.sh"
}
This code will populate all the values and create a server.sh script in the same directory you are running the Terraform code from.
You could complement this code with the null_resource you wanted:
resource "null_resource" "tunnel_install" {
depends_on = [
local_file.cloudflare_tunnel_script,
]
triggers = {
always_run = timestamp()
}
provisioner "local-exec" {
command = "${path.module}/server.sh"
}
}
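If the rendered server.sh turns out not to be executable on your system, one small tweak (an assumption on my part, not something from the original answer) is to invoke it through the shell inside the null_resource above:

provisioner "local-exec" {
  # Running the script through bash avoids depending on the file's execute bit
  command = "bash ${path.module}/server.sh"
}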

Deploying a self-managed EKS cluster via Terraform

It's my first time doing this, and this is mostly a copy-pasted beginner example. Not sure what I'm missing.
self_managed_node_group_defaults = {
  disk_size = 50
}

self_managed_node_groups = {
  bottlerocket = {
    name          = "bottlerocket-self-mng"
    platform      = "bottlerocket"
    ami_id        = "xxx"
    instance_type = "t2.small"
    desired_size  = 2
    iam_role_additional_policies = ["arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"]

    pre_bootstrap_user_data = <<-EOT
      echo "foo"
      export FOO=bar
    EOT

    bootstrap_extra_args = "--kubelet-extra-args '--node-labels=node.kubernetes.io/lifecycle=spot'"

    post_bootstrap_user_data = <<-EOT
      cd /tmp
      sudo yum install -y https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_amd64/amazon-ssm-agent.rpm
      sudo systemctl enable amazon-ssm-agent
      sudo systemctl start amazon-ssm-agent
    EOT
  }
}
And the error it throws:
Error: Your query returned no results. Please change your search criteria and try again.
with module.eks.module.self_managed_node_group["bottlerocket"].data.aws_ami.eks_default[0]
on .terraform/modules/eks/modules/self-managed-node-group/main.tf line 5, in data "aws_ami" "eks_default":
data "aws_ami" "eks_default" {

Provision a DigitalOcean instance using Terraform

I am trying to provision a DigitalOcean droplet using Terraform. I appear to be missing the host argument in the connection block, but I am not certain what value I need for DigitalOcean.
This is my configuration file:
resource "digitalocean_droplet" "test" {
image = "ubuntu-18-04-x64"
name = "test"
region = "nyc1"
size = "512mb"
private_networking = true
ssh_keys = [
"${var.ssh_fingerprint}"
]
connection {
user = "root"
type = "ssh"
private_key = "${file("~/.ssh/id_rsa")}"
timeout = "2m"
}
provisioner "remote-exec" {
inline = [
"export PATH=$PATH:/usr/bin",
# install nginx
"sudo apt-get update",
"sudo apt-get -y install nginx"
]
}
}
"terraform validate" gives me the error:
Error: Missing required argument
on frontend.tf line 11, in resource "digitalocean_droplet" "test":
11: connection {
The argument "host" is required, but no definition was found.
I fiddled around with this and found the answer.
In the connection block we should have the host as:
connection {
  user        = "root"
  type        = "ssh"
  host        = "${self.ipv4_address}"
  private_key = "${file(var.pvt_key)}"
  timeout     = "2m"
}
You can also explicitly reference the exported attribute:

connection {
  user        = "root"
  host        = "${digitalocean_droplet.test.ipv4_address}"
  type        = "ssh"
  private_key = "${file(var.pvt_key)}"
}
I think there is a problem with your syntax. Try it like below:
private_key = file("/home/user/.ssh/id_rsa")
I'm using Terraform version 0.12.25.
Best of luck.

Terraform OpenStack instance doesn't return floating IP

I'm setting up an OpenStack instance using Terraform. I'm writing the returned IP to a file, but for some reason it's always empty (I have looked at the instance in the OpenStack console and everything is correct with the IP, security groups, etc.).
resource "openstack_compute_instance_v2" "my-deployment-web" {
count = "1"
name = "my-name-WEB"
flavor_name = "m1.medium"
image_name = "RHEL7Secretname"
security_groups = [
"our_security_group"]
key_pair = "our-keypair"
network {
name = "public"
}
metadata {
expire = "2",
owner = ""
}
connection {
type = "ssh"
user = "vagrant"
private_key = "config/vagrant_private.key"
agent = "false"
timeout = "15m"
}
##Create Ansible host in staging inventory
provisioner "local-exec" {
command = "echo -e '\n[web]\n${openstack_compute_instance_v2.my-deployment-web.network.0.floating_ip}' > ../ansible/inventories/staging/hosts"
interpreter = ["sh", "-c"]
}
}
The generated host file only gets [web] but no IP. Anyone know why?
[web]
Modifying the reference from
${openstack_compute_instance_v2.my-deployment-web.network.0.floating_ip}
to
${openstack_compute_instance_v2.my-deployment-web.network.0.access_ip_v4}
solved the problem. Thank you @Matt Schuchard.
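For reference, the local-exec provisioner from the question then reads as follows (only the attribute changes):

provisioner "local-exec" {
  # access_ip_v4 is populated for this network, unlike floating_ip in this setup
  command     = "echo -e '\n[web]\n${openstack_compute_instance_v2.my-deployment-web.network.0.access_ip_v4}' > ../ansible/inventories/staging/hosts"
  interpreter = ["sh", "-c"]
}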
