Ansible reboot not rebooting even though '/var/run/reboot-required' exists

I have tested this playbook with updating so I know that the credentials work, as well as the elevation to sudo. I have a test server with an extant /var/run/reboot-required file. I cannot get my ansible playbook to reboot the server though. This is an Ubuntu server. Playbook currently:
---
- hosts: server
vars:
ansible_user: sudo_user
ansible_password: "password"
become: yes
become_user: sudo_user
tasks:
- name: Check if reboot required
stat:
path: /var/run/reboot-required
register: reboot_required_file
- name: Reboot if required
reboot:
when: reboot_required_file.stat.exists == true
I've tried variations of this playbook and I can't get it to reboot the server. The playbook returns:
PLAY [server] *******************************************************************************************************************************************************************
TASK [Gathering Facts] **********************************************************************************************************************************************************
ok: [server]
PLAY RECAP **********************************************************************************************************************************************************************
server : ok=1 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
I've also tried just running a shell command:
- name:
  shell: if [ -f /var/run/reboot-required ]; then init 6; else wall "reboot not required"; fi
  ignore_errors: true
This also doesn't work.
Cheers

Check the indentation (alignment) of the tasks in your playbook. The following playbook ran successfully for me:
---
- hosts: nodes
  become: yes
  become_user: sudo_user
  tasks:
    - name: Check if reboot required
      stat:
        path: /var/run/reboot-required
      register: reboot_required_file
    - name: Display variable
      debug:
        msg: '{{reboot_required_file.stat.exists}}'
    - name: Reboot if required
      reboot:
      when: reboot_required_file.stat.exists == true
To supply the sudo password, run the playbook as follows and enter the password at the prompt:
ansible-playbook playbook.yml --ask-become-pass

Related

How to apply the changes of a Linux users group assignments inside a local ansible playbook?

I'm trying to install docker and create a docker image within a local ansible playbook containing multiple plays, adding the user to docker group in between:
- hosts: localhost
  connection: local
  become: yes
  gather_facts: no
  tasks:
    - name: install docker
      ansible.builtin.apt:
        update_cache: yes
        pkg:
          - docker.io
          - python3-docker
    - name: Add current user to docker group
      ansible.builtin.user:
        name: "{{ lookup('env', 'USER') }}"
        append: yes
        groups: docker
    - name: Ensure that docker service is running
      ansible.builtin.service:
        name: docker
        state: started
- hosts: localhost
  connection: local
  gather_facts: no
  tasks:
    - name: Create docker container
      community.docker.docker_container:
        image: ...
        name: ...
When executing this playbook with ansible-playbook I'm getting a permission denied error at the "Create docker container" task. Rebooting and calling the playbook again solves the error.
I have tried manually executing some of the commands suggested here and executing the playbook again which works, but I'd like to do everything from within the playbook.
Adding a task like
- name: allow user changes to take effect
  ansible.builtin.shell:
    cmd: exec sg docker newgrp `id -gn`
does not work.
How can I refresh the Linux user group assignments from within the playbook?
I´m on Ubuntu 18.04.

Error running playbook that only affects one of the hosts

I've recently started using more and more Ansible, and especially AWX, for simple repetitive tasks. Below is a playbook for downloading, installing and configuring logging via a Bash script. The script is for two hosts: Ubuntu 20.04 and CentOS 7.6, and for the latter, making some changes to SELinux is required.
The question is, why am I getting an error for the Ubuntu only and not the CentOS also?
Here is the playbook:
# Download and run Nagios Log Server configuration script
---
- name: nagios-log configure
  hosts: all
  remote_user: root
  tasks:
    - name: Distribution
      debug: msg="{{ ansible_distribution }}"
    - name: Download setup-linux.sh
      get_url:
        url: http://10.10.10.10/nagioslogserver/scripts/setup-linux.sh
        validate_certs: no
        dest: /tmp/setup-linux.sh
    - name: Change script permission
      file: dest=/tmp/setup-linux.sh mode=a+x
    - name: Run setup-linux.sh
      shell: /tmp/setup-linux.sh -s 10.10.10.10 -p 5544
      register: ps
      failed_when: "ps.rc not in [ 0, 1 ]"
    - name: Install policycoreutils if needed
      yum:
        name:
          - policycoreutils
          - policycoreutils-python
        state: latest
      when: ansible_distribution == 'CentOS'
    - name: Check if policy file exists
      stat:
        path: /etc/selinux/targeted/active/ports.local
      register: result
      when: ansible_distribution == 'CentOS'
    - name: Check whether line exists
      find:
        paths: /etc/selinux/targeted/active/ports.local
        contains: '5544'
      register: found
      when: result.stat.exists == True
    - name: Add SELinux policy exception if missing
      command: semanage port -a -t syslogd_port_t -p udp 5544
      when: found.matched > 0
    - name: Restart rsyslog
      systemd:
        name: rsyslog
        state: restarted
        enabled: yes
And here is the error output when running the playbook on AWX:
TASK [Check whether line exists] ***********************************************
fatal: [Ubuntu.domain.corp]: FAILED! => {"msg": "The conditional check 'result.stat.exists == True' failed. The error was: error while evaluating conditional (result.stat.exists == True): 'dict object' has no attribute 'stat'\n\nThe error appears to be in '/tmp/awx_154_1811rny6/project/nagios-log.yml': line 39, column 5, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n - name: Check whether line exists\n ^ here\n"}
ok: [Centos.domain.corp]
For reasons I can't comprehend, the CentOS server is fine, but the Ubuntu is getting a strange error that I don't understand. I've tried other methods to achieve the same logic as the when command.

You get this error because the variable result is registered by this task:
- name: Check if policy file exists
  stat:
    path: /etc/selinux/targeted/active/ports.local
  register: result
  when: ansible_distribution == 'CentOS'
Because of when: ansible_distribution == 'CentOS', this task is skipped on Ubuntu. Ansible still registers result for a skipped task, but it then only holds the skip status and has no stat attribute, which is why result.stat.exists fails when the playbook runs on Ubuntu.
To fix this (and run the task using result on CentOS only as well) you can change it to this:
- name: Check whether line exists
  find:
    paths: /etc/selinux/targeted/active/ports.local
    contains: '5544'
  register: found
  when:
    - ansible_distribution == 'CentOS'
    - result.stat.exists == True
- name: Add SELinux policy exception if missing
  command: semanage port -a -t syslogd_port_t -p udp 5544
  when:
    - ansible_distribution == 'CentOS'
    - found.matched > 0
Or you can put all CentOS specific tasks in a block like this:
- name: CentOS specific tasks
  block:
    - name: Install policycoreutils if needed
      yum:
        name:
          - policycoreutils
          - policycoreutils-python
        state: latest
    - name: Check if policy file exists
      stat:
        path: /etc/selinux/targeted/active/ports.local
      register: result
    - name: Check whether line exists
      find:
        paths: /etc/selinux/targeted/active/ports.local
        contains: '5544'
      register: found
      when: result.stat.exists == True
    - name: Add SELinux policy exception if missing
      command: semanage port -a -t syslogd_port_t -p udp 5544
      when: found.matched > 0
  when: ansible_distribution == 'CentOS'
Or you can put them in their own file and include that file. There are actually a lot of ways to do this.
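The include-file variant could look roughly like the sketch below. The file name centos-selinux.yml is a made-up placeholder, and the sketch assumes the four CentOS tasks from the block example are moved into that file unchanged.

```yaml
# Main playbook: pull in the CentOS-only tasks from a separate file.
# centos-selinux.yml is a hypothetical file name, not taken from the question.
- name: Include CentOS specific tasks
  include_tasks: centos-selinux.yml
  when: ansible_distribution == 'CentOS'
```

Because the condition sits on the include, the tasks inside the file do not need their own distribution check, and result is never referenced on Ubuntu.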

Ansible - List of Linux security updates needed on remote servers

I wanted to run a playbook that will accurately report if one of the remote servers requires security updates. Ansible server = Centos 7, remote servers Amazon Linux.
Remote server would highlight on startup something like below:
https://aws.amazon.com/amazon-linux-2/
8 package(s) needed for security, out of 46 available
Run "sudo yum update" to apply all updates.
To confirm this, I put a playbook together, cobbled from many sources (below) that does perform that function to a degree. It does suggest whether the remote server requires security updates but doesn't say what these updates are?
- name: check if security updates are needed
  hosts: elk
  tasks:
    - name: check yum security updates
      shell: "yum updateinfo list all security"
      changed_when: false
      register: security_update
    - debug: msg="Security update required"
      when: security_update.stdout != "0"
    - name: list some packages
      yum: list=available
Then, when I run my updates install playbook:
- hosts: elk
  remote_user: ansadm
  become: yes
  become_method: sudo
  tasks:
    - name: Move repos from backup to yum.repos.d
      shell: mv -f /backup/* /etc/yum.repos.d/
      register: shell_result
      failed_when: '"No such file or directory" in shell_result.stderr_lines'
    - name: Remove redhat.repo
      shell: rm -f /etc/yum.repos.d/redhat.repo
      register: shell_result
      failed_when: '"No such file or directory" in shell_result.stderr_lines'
    - name: add line to yum.conf
      lineinfile:
        dest: /etc/yum.conf
        line: exclude=kernel* redhat-release*
        state: present
        create: yes
    - name: yum clean
      shell: yum make-cache
      register: shell_result
      failed_when: '"There are no enabled repos" in shell_result.stderr_lines'
    - name: install all security patches
      yum:
        name: '*'
        state: latest
        security: yes
        bugfix: yes
        skip_broken: yes
After install, you would get something similar to below (btw - these are outputs from different servers)
https://aws.amazon.com/amazon-linux-2/
No packages needed for security; 37 packages available
Run "sudo yum update" to apply all updates.
But if I run my list security updates playbook again - it gives a false positive as it still reports security updates needed?
PLAY [check if security updates are needed] ************************************
TASK [Gathering Facts] *********************************************************
ok: [10.10.10.192]
TASK [check yum security updates] **********************************************
ok: [10.10.10.192]
TASK [debug] *******************************************************************
ok: [10.10.10.192] => {
"msg": "Security update required"
}
TASK [list some packages] ******************************************************
ok: [10.10.10.192]
PLAY RECAP *********************************************************************
10.10.10.192 : ok=4 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
[ansadm@ansible playbooks]$
What do I need to omit/include in playbook to reflect the changes after the install of the updates?
Thanks in advance :)

When I run your yum command locally on my system, I get the following:
45) local-user@server:/home/local-user> yum updateinfo list all security
Loaded plugins: ulninfo
local_repo | 2.9 kB 00:00:00
updateinfo list done
Granted, our systems may produce different output here, but it serves the purpose of my explanation. The entire output of the command is saved to your registered variable, but your when conditional says to run the debug task whenever that output is not EXACTLY "0".
So unless you pare that response down with awk or sed, any output containing more text than literally just the character "0" means the debug task will always fire.
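One way to pare the output down inside the playbook, rather than with awk or sed, is sketched below. It rests on two assumptions that should be checked against the real output on your hosts: that dropping all from the yum command limits the list to advisories that are still pending, and that each advisory line contains the text "Sec." in its type column. The names pending_security and the task names are made up for illustration.

```yaml
- name: check yum security updates
  command: yum updateinfo list security
  changed_when: false
  register: security_update

# Keep only lines that look like security advisories.
# Assumption: such lines contain "Sec." (for example "important/Sec.").
- name: collect pending security advisories
  set_fact:
    pending_security: "{{ security_update.stdout_lines | select('search', 'Sec[.]') | list }}"

# Prints the matching lines, so the report names the updates themselves.
- name: report pending security advisories
  debug:
    var: pending_security
  when: pending_security | length > 0
```

Comparing the length of a filtered list avoids the exact-string comparison against "0", and the debug output shows which advisories are outstanding instead of a fixed message.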

Ansible file copy with sudo fails after upgrading to 1.9

In a playbook, I copy files using sudo. It used to work... Until we migrated to Ansible 1.9... Since then, it fails with the following error message:
"ssh connection closed waiting for sudo password prompt"
I provide the ssh and sudo passwords (through the Ansible prompt), and all the other commands running through sudo are successful (only the file copy and template fail).
My command is:
ansible-playbook -k --ask-become-pass --limit=testhost -C -D playbooks/debug.yml
and the playbook contains:
- hosts: designsync
  gather_facts: yes
  tasks:
    - name: Make sure the syncmgr home folder exists
      action: file path=/home/syncmgr owner=syncmgr group=syncmgr mode=0755 state=directory
      sudo: yes
      sudo_user: syncmgr
    - name: Copy .cshrc file
      action: copy src=roles/designsync/files/syncmgr.cshrc dest=/home/syncmgr/.cshrc owner=syncmgr group=syncmgr mode=0755
      sudo: yes
      sudo_user: syncmgr
Is this a bug or did I miss something?
François.

Your playbook should look like:
- hosts: designsync
  gather_facts: yes
  tasks:
    - name: Make sure the syncmgr home folder exists
      sudo: yes
      sudo_user: syncmgr
      file:
        path: "/home/syncmgr"
        owner: syncmgr
        group: syncmgr
        mode: 0755
        state: directory
    - name: Copy .cshrc file
      sudo: yes
      sudo_user: syncmgr
      copy:
        src: "roles/designsync/files/syncmgr.cshrc"
        dest: "/home/syncmgr/.cshrc"
        owner: syncmgr
        group: syncmgr
        mode: 0755
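Since Ansible 1.9 introduced the become keywords as the replacement for sudo and sudo_user, and the command line in the question already uses --ask-become-pass, the same play can also be written with become. This is only a sketch of the equivalent syntax; whether it avoids the error in the question is not confirmed.

```yaml
- hosts: designsync
  gather_facts: yes
  # Play-level privilege escalation, replacing per-task sudo/sudo_user.
  become: yes
  become_user: syncmgr
  tasks:
    - name: Make sure the syncmgr home folder exists
      file:
        path: /home/syncmgr
        owner: syncmgr
        group: syncmgr
        mode: "0755"
        state: directory
    - name: Copy .cshrc file
      copy:
        src: roles/designsync/files/syncmgr.cshrc
        dest: /home/syncmgr/.cshrc
        owner: syncmgr
        group: syncmgr
        mode: "0755"
```

Quoting the mode as "0755" keeps YAML from interpreting it as a number.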

Depending on the exact version of Ansible you're using, there may be a bug with sudo_user (I experienced it myself).
Try changing your playbooks from "sudo_user" to "remote_user".

How to wait for server restart using Ansible?

I'm trying to restart the server and then wait, using this:
- name: Restart server
  shell: reboot
- name: Wait for server to restart
  wait_for:
    port=22
    delay=1
    timeout=300
But I get this error:
TASK: [iptables | Wait for server to restart] *********************************
fatal: [example.com] => failed to transfer file to /root/.ansible/tmp/ansible-tmp-1401138291.69-222045017562709/wait_for:
sftp> put /tmp/tmpApPR8k /root/.ansible/tmp/ansible-tmp-1401138291.69-222045017562709/wait_for
Connected to example.com.
Connection closed

Ansible >= 2.7 (released in Oct 2018)
Use the built-in reboot module:
- name: Wait for server to restart
  reboot:
    reboot_timeout: 3600
Ansible < 2.7
Restart as a task
- name: restart server
  shell: 'sleep 1 && shutdown -r now "Reboot triggered by Ansible" && sleep 1'
  async: 1
  poll: 0
  become: true
This runs the shell command as an asynchronous task, so Ansible will not wait for the command to finish. Usually the async parameter sets the maximum run time for the task, but because poll is set to 0, Ansible never polls to see whether the command has finished, which makes it "fire and forget". The sleeps before and after shutdown prevent the SSH connection from breaking during the restart while Ansible is still connected to the remote host.
Wait as a task
You could just use:
- name: Wait for server to restart
  local_action:
    module: wait_for
      host={{ inventory_hostname }}
      port=22
      delay=10
  become: false
You may prefer to use the {{ ansible_ssh_host }} variable as the host and/or {{ ansible_ssh_port }} as the port if your inventory (Ansible hosts file) contains entries like:
hostname ansible_ssh_host=some.other.name.com ansible_ssh_port=2222
This runs the wait_for task on the machine running Ansible. The task waits for port 22 to open on the remote host, starting after a 10-second delay.
Restart and wait as handlers
But I suggest using both of these as handlers, not tasks.
There are 2 main reasons to do this:
- code reuse: you can use a handler for many tasks. Example: trigger a server restart after changing the timezone and after changing the kernel,
- trigger only once: if several tasks notify the same handler and more than one of them makes a change, the handler still runs only once. Example: if an httpd restart handler is attached to both an httpd config change and an SSL certificate update, and both the config and the certificate change, httpd is restarted only once.
Read more about handlers here.
Restarting and waiting for the restart as handlers:
handlers:
  - name: Restart server
    shell: 'sleep 1 && shutdown -r now "Reboot triggered by Ansible" && sleep 1'
    async: 1
    poll: 0
    ignore_errors: true
    become: true
  - name: Wait for server to restart
    local_action:
      module: wait_for
        host={{ inventory_hostname }}
        port=22
        delay=10
    become: false
Then notify both handlers from your task in sequence, like this:
tasks:
  - name: Set hostname
    hostname: name=somename
    notify:
      - Restart server
      - Wait for server to restart
Note that handlers are run in the order they are defined, not the order they are listed in notify!
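On Ansible 2.7 or later, the two handlers can plausibly be collapsed into one by using the reboot module mentioned at the top of this answer, since that module both restarts the host and waits for it to come back. A minimal sketch, reusing the handler and task names from above and the same reboot_timeout value:

```yaml
handlers:
  # One handler replaces the restart + wait pair on Ansible >= 2.7.
  - name: Restart server
    reboot:
      reboot_timeout: 3600
    become: true

tasks:
  - name: Set hostname
    hostname: name=somename
    notify: Restart server
```

With a single handler there is no ordering between handlers to worry about.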

You should change the wait_for task to run as local_action, and specify the host you're waiting for. For example:
- name: Wait for server to restart
  local_action:
    module: wait_for
      host=192.168.50.4
      port=22
      delay=1
      timeout=300

The most reliable approach I've found with 1.9.4 is the following (this is updated; the original version is at the bottom):
- name: Example ansible play that requires reboot
  sudo: yes
  gather_facts: no
  hosts:
    - myhosts
  tasks:
    - name: example task that requires reboot
      yum: name=* state=latest
      notify: reboot sequence
  handlers:
    - name: reboot sequence
      changed_when: "true"
      debug: msg='trigger machine reboot sequence'
      notify:
        - get current time
        - reboot system
        - waiting for server to come back
        - verify a reboot was actually initiated
    - name: get current time
      command: /bin/date +%s
      register: before_reboot
      sudo: false
    - name: reboot system
      shell: sleep 2 && shutdown -r now "Ansible package updates triggered"
      async: 1
      poll: 0
      ignore_errors: true
    - name: waiting for server to come back
      local_action: wait_for host={{ inventory_hostname }} state=started delay=30 timeout=220
      sudo: false
    - name: verify a reboot was actually initiated
      # machine should have started after it has been rebooted
      shell: (( `date +%s` - `awk -F . '{print $1}' /proc/uptime` > {{ before_reboot.stdout }} ))
      sudo: false
Note the async option: 1.8 and 2.0 may work with 0, but 1.9 needs it set to 1. The above also checks that the machine has actually been rebooted. This is useful because I once had a typo that made the reboot fail with no indication of the failure.
The big issue is waiting for the machine to be up. This version just sits there for 330 seconds and never tries to access the host earlier. Some other answers suggest using port 22. This is good if both of these are true:
- you have direct access to the machines
- your machine is accessible immediately after port 22 is open
These are not always true, so I decided to waste 5 minutes of compute time. I hope Ansible extends the wait_for module to actually check host state to avoid wasting time.
By the way, the answer suggesting handlers is nice. +1 for handlers from me (and I updated this answer to use handlers).
Here's the original version, but it is not as good and not as reliable:
- name: Reboot
  sudo: yes
  gather_facts: no
  hosts:
    - OSEv3:children
  tasks:
    - name: get current uptime
      shell: cat /proc/uptime | awk -F . '{print $1}'
      register: uptime
      sudo: false
    - name: reboot system
      shell: sleep 2 && shutdown -r now "Ansible package updates triggered"
      async: 1
      poll: 0
      ignore_errors: true
    - name: waiting for server to come back
      local_action: wait_for host={{ inventory_hostname }} state=started delay=30 timeout=300
      sudo: false
    - name: verify a reboot was actually initiated
      # uptime after reboot should be smaller than before reboot
      shell: (( `cat /proc/uptime | awk -F . '{print $1}'` < {{ uptime.stdout }} ))
      sudo: false

2018 Update
As of 2.3, Ansible now ships with the wait_for_connection module, which can be used for exactly this purpose.
#
## Reboot
#
- name: (reboot) Reboot triggered
  command: /sbin/shutdown -r +1 "Ansible-triggered Reboot"
  async: 0
  poll: 0
- name: (reboot) Wait for server to restart
  wait_for_connection:
    delay: 75
Using shutdown -r +1 keeps the command from returning an exit code of 1, which would make Ansible fail the task. The shutdown runs as an async task, so the wait_for_connection task has to be delayed by at least 60 seconds. 75 gives us a buffer for those snowflake cases.
wait_for_connection - Waits until remote system is reachable/usable

I wanted to comment on Shahar's post: it uses a hardcoded host address. It is better to use the {{ inventory_hostname }} variable, which references the host Ansible is currently configuring, so the code becomes:
- name: Wait for server to restart
  local_action:
    module: wait_for
      host={{ inventory_hostname }}
      port=22
      delay=1
      timeout=300

With newer versions of Ansible (1.9.1 in my case), setting the poll and async parameters to 0 is sometimes not enough (this may depend on the distribution Ansible is installed on). As explained in https://github.com/ansible/ansible/issues/10616, one workaround is:
- name: Reboot
  shell: sleep 2 && shutdown -r now "Ansible updates triggered"
  async: 1
  poll: 0
  ignore_errors: true
Then wait for the reboot to complete, as explained in many answers on this page.

Through trial and error + a lot of reading this is what ultimately worked for me using the 2.0 version of Ansible:
$ ansible --version
ansible 2.0.0 (devel 974b69d236) last updated 2015/09/01 13:37:26 (GMT -400)
lib/ansible/modules/core: (detached HEAD bbcfb1092a) last updated 2015/09/01 13:37:29 (GMT -400)
lib/ansible/modules/extras: (detached HEAD b8803306d1) last updated 2015/09/01 13:37:29 (GMT -400)
config file = /Users/sammingolelli/projects/git_repos/devops/ansible/playbooks/test-2/ansible.cfg
configured module search path = None
My solution for disabling SELinux and rebooting a node when needed:
---
- name: disable SELinux
  selinux: state=disabled
  register: st
- name: reboot if SELinux changed
  shell: shutdown -r now "Ansible updates triggered"
  async: 0
  poll: 0
  ignore_errors: true
  when: st.changed
- name: waiting for server to reboot
  wait_for: host="{{ ansible_ssh_host | default(inventory_hostname) }}" port={{ ansible_ssh_port | default(22) }} search_regex=OpenSSH delay=30 timeout=120
  connection: local
  sudo: false
  when: st.changed
# vim:ft=ansible:

- wait_for:
    port: 22
    host: "{{ inventory_hostname }}"
  delegate_to: 127.0.0.1

In case you don't have DNS setup for the remote server yet, you can pass the IP address instead of a variable hostname:
- name: Restart server
  command: shutdown -r now
- name: Wait for server to restart successfully
  local_action:
    module: wait_for
      host={{ ansible_default_ipv4.address }}
      port=22
      delay=1
      timeout=120
These are the two tasks I added to the end of my ansible-swap playbook (which installs 4GB of swap on new Digital Ocean droplets).

I've created a reboot_server ansible role that can get dynamically called from other roles with:
- name: Reboot server if needed
  include_role:
    name: reboot_server
  vars:
    reboot_force: false
The role content is:
- name: Check if server restart is necessary
  stat:
    path: /var/run/reboot-required
  register: reboot_required
- name: Debug reboot_required
  debug: var=reboot_required
- name: Restart if it is needed
  shell: |
    sleep 2 && /sbin/shutdown -r now "Reboot triggered by Ansible"
  async: 1
  poll: 0
  ignore_errors: true
  when: reboot_required.stat.exists == true
  register: reboot
  become: true
- name: Force Restart
  shell: |
    sleep 2 && /sbin/shutdown -r now "Reboot triggered by Ansible"
  async: 1
  poll: 0
  ignore_errors: true
  when: reboot_force|default(false)|bool
  register: forced_reboot
  become: true
# # Debug reboot execution
# - name: Debug reboot var
#   debug: var=reboot
# - name: Debug forced_reboot var
#   debug: var=forced_reboot
# Don't assume the inventory_hostname is resolvable and delay 10 seconds at start
- name: Wait 300 seconds for port 22 to become open and contain "OpenSSH"
  wait_for:
    port: 22
    host: '{{ (ansible_ssh_host|default(ansible_host))|default(inventory_hostname) }}'
    search_regex: OpenSSH
    delay: 10
  connection: local
  when: reboot.changed or forced_reboot.changed
This was originally designed to work with Ubuntu OS.

I haven't seen a lot of visibility on this, but a recent change (https://github.com/ansible/ansible/pull/43857) added the "ignore_unreachable" keyword. This allows you to do something like this:
- name: restart server
  shell: reboot
  ignore_unreachable: true
- name: wait for server to come back
  wait_for_connection:
    timeout: 120
- name: the next action
  ...
