'Ansible can't stop managed service
I have a Spring Boot application managed as a systemd service on a Red Hat Enterprise Linux Server 7.7 (Maipo) cluster. The service unit configuration is all right, I can directly start and stop the service unit by hand on the system that I'm trying to command/control with Ansible 2.8.5.
The process owner is tomcat and I'm using another user (deployer) that can "become" tomcat and run the commands on the hosts. That's fine for some other actions, but it fails when I put the actions to manage the service (I've tried with both systemd and service modules):
# ./ansible/roles/boot-core/tasks/main.yml
---
- name: "Deploy/Install new application"
block:
# - name: "Make sure {{ service_id }} is stopped"
# systemd:
# name: "{{ service_id }}"
# state: stopped
- name: "Make sure {{ service_id }} is stopped"
service:
name: "{{ service_id }}"
state: stopped
# - name: "Make sure {{ service_id }} is enabled and started"
# systemd:
# enabled: yes
# name: "{{ service_id }}"
# state: started
- name: "Make sure {{ service_id }} is enabled and started"
service:
enabled: yes
name: "{{ service_id }}"
state: started
# ./ansible/site.yml
---
- hosts: webservers
any_errors_fatal: true
become_user: tomcat
become: yes
force_handlers: true
gather_facts: no
roles:
- boot-core
...and this is how I'm running the playbook as deployer (on a GitLab pipeline, the syntax is different, so I convert it here to what it would look like in a UN*X shell):
$ eval $(ssh-agent -s)
$ ssh-add <(echo "${PRIVATE_SSH_KEY}")
$ ansible-playbook -vvv \
--extra-vars CI_PIPELINE_ID="${CI_PIPELINE_ID}" \
--extra-vars CI_PROJECT_DIR="${CI_PROJECT_DIR}" \
--inventory-file "${CI_PROJECT_DIR}/infrastructure/ansible/inventories/${ANSIBLE_INVENTORY}" \
--limit webservers
--user deployer
"${CI_PROJECT_DIR}/infrastructure/ansible/site.yml"
This is what's getting printed in the logs:
TASK [boot-core : Make sure boot-core is stopped] ****************************
task path: /builds/x80486/boot-core/infrastructure/ansible/roles/boot-core/tasks/main.yml:58
<unixvm001> ESTABLISH SSH CONNECTION FOR USER: deployer
<unixvm001> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="deployer"' -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ea5c024329 unixvm001 '/bin/sh -c '"'"'echo ~deployer && sleep 0'"'"''
<unixvm001> (0, '/usr/local/home/deployer\n', "Warning: Permanently added 'unixvm001,10.5.177.1' (ECDSA) to the list of known hosts.\r\n\t\t\t\n")
<unixvm001> ESTABLISH SSH CONNECTION FOR USER: deployer
<unixvm001> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="deployer"' -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ea5c024329 unixvm001 '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo /var/tmp/ansible-tmp-1571756098.47-177915759067620 `" && echo ansible-tmp-1571756098.47-177915759067620="` echo /var/tmp/ansible-tmp-1571756098.47-177915759067620 `" ) && sleep 0'"'"''
<unixvm001> (0, 'ansible-tmp-1571756098.47-177915759067620=/var/tmp/ansible-tmp-1571756098.47-177915759067620\n', '')
<unixvm001> Attempting python interpreter discovery
<unixvm001> ESTABLISH SSH CONNECTION FOR USER: deployer
<unixvm001> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="deployer"' -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ea5c024329 unixvm001 '/bin/sh -c '"'"'echo PLATFORM; uname; echo FOUND; command -v '"'"'"'"'"'"'"'"'/usr/bin/python'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python3.7'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python3.6'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python3.5'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python2.7'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python2.6'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'/usr/libexec/platform-python'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'/usr/bin/python3'"'"'"'"'"'"'"'"'; command -v '"'"'"'"'"'"'"'"'python'"'"'"'"'"'"'"'"'; echo ENDFOUND && sleep 0'"'"''
<unixvm001> (0, 'PLATFORM\nLinux\nFOUND\n/usr/bin/python\n/usr/bin/python2.7\n/usr/libexec/platform-python\n/usr/bin/python\nENDFOUND\n', '')
<unixvm001> ESTABLISH SSH CONNECTION FOR USER: deployer
<unixvm001> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="deployer"' -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ea5c024329 unixvm001 '/bin/sh -c '"'"'/usr/bin/python && sleep 0'"'"''
<unixvm001> (0, '{"osrelease_content": "NAME=\\"Red Hat Enterprise Linux Server\\"\\nVERSION=\\"7.7 (Maipo)\\"\\nID=\\"rhel\\"\\nID_LIKE=\\"fedora\\"\\nVARIANT=\\"Server\\"\\nVARIANT_ID=\\"server\\"\\nVERSION_ID=\\"7.7\\"\\nPRETTY_NAME=\\"Red Hat Enterprise Linux Server 7.7 (Maipo)\\"\\nANSI_COLOR=\\"0;31\\"\\nCPE_NAME=\\"cpe:/o:redhat:enterprise_linux:7.7:GA:server\\"\\nHOME_URL=\\"https://www.redhat.com/\\"\\nBUG_REPORT_URL=\\"https://bugzilla.redhat.com/\\"\\n\\nREDHAT_BUGZILLA_PRODUCT=\\"Red Hat Enterprise Linux 7\\"\\nREDHAT_BUGZILLA_PRODUCT_VERSION=7.7\\nREDHAT_SUPPORT_PRODUCT=\\"Red Hat Enterprise Linux\\"\\nREDHAT_SUPPORT_PRODUCT_VERSION=\\"7.7\\"\\n", "platform_dist_result": ["redhat", "7.7", "Maipo"]}\n', '')
Using module file /usr/lib/python2.7/site-packages/ansible/modules/system/setup.py
<unixvm001> PUT /root/.ansible/tmp/ansible-local-4292hN1DYR/tmpYM8xgh TO /var/tmp/ansible-tmp-1571756098.47-177915759067620/AnsiballZ_setup.py
<unixvm001> SSH: EXEC sftp -b - -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="deployer"' -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ea5c024329 '[unixvm001]'
<unixvm001> (0, 'sftp> put /root/.ansible/tmp/ansible-local-4292hN1DYR/tmpYM8xgh /var/tmp/ansible-tmp-1571756098.47-177915759067620/AnsiballZ_setup.py\n', '')
<unixvm001> ESTABLISH SSH CONNECTION FOR USER: deployer
<unixvm001> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="deployer"' -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ea5c024329 unixvm001 '/bin/sh -c '"'"'setfacl -m u:tomcat:r-x /var/tmp/ansible-tmp-1571756098.47-177915759067620/ /var/tmp/ansible-tmp-1571756098.47-177915759067620/AnsiballZ_setup.py && sleep 0'"'"''
<unixvm001> (0, '', '')
<unixvm001> ESTABLISH SSH CONNECTION FOR USER: deployer
<unixvm001> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="deployer"' -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ea5c024329 -tt unixvm001 '/bin/sh -c '"'"'sudo -H -S -n -u tomcat /bin/sh -c '"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-ldekvnpzwgrgribssedqdqimvuzpvozm ; /usr/bin/python /var/tmp/ansible-tmp-1571756098.47-177915759067620/AnsiballZ_setup.py'"'"'"'"'"'"'"'"' && sleep 0'"'"''
Escalation succeeded
<unixvm001> (0, '\r\n{"invocation": {"module_args": {"filter": "ansible_service_mgr", "gather_subset": ["!all"], "fact_path": "/etc/ansible/facts.d", "gather_timeout": 10}}, "ansible_facts": {"ansible_service_mgr": "systemd"}}\r\n', 'Shared connection to unixvm001 closed.\r\n')
Using module file /usr/lib/python2.7/site-packages/ansible/modules/system/systemd.py
<unixvm001> PUT /root/.ansible/tmp/ansible-local-4292hN1DYR/tmpuOv4ys TO /var/tmp/ansible-tmp-1571756098.47-177915759067620/AnsiballZ_systemd.py
<unixvm001> SSH: EXEC sftp -b - -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="deployer"' -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ea5c024329 '[unixvm001]'
<unixvm001> (0, 'sftp> put /root/.ansible/tmp/ansible-local-4292hN1DYR/tmpuOv4ys /var/tmp/ansible-tmp-1571756098.47-177915759067620/AnsiballZ_systemd.py\n', '')
<unixvm001> ESTABLISH SSH CONNECTION FOR USER: deployer
<unixvm001> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="deployer"' -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ea5c024329 unixvm001 '/bin/sh -c '"'"'setfacl -m u:tomcat:r-x /var/tmp/ansible-tmp-1571756098.47-177915759067620/ /var/tmp/ansible-tmp-1571756098.47-177915759067620/AnsiballZ_systemd.py && sleep 0'"'"''
<unixvm001> (0, '', '')
<unixvm001> ESTABLISH SSH CONNECTION FOR USER: deployer
<unixvm001> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="deployer"' -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ea5c024329 -tt unixvm001 '/bin/sh -c '"'"'sudo -H -S -n -u tomcat /bin/sh -c '"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-nnwsmnabevfloceodiibjgkauxvxykgu ; /usr/bin/python /var/tmp/ansible-tmp-1571756098.47-177915759067620/AnsiballZ_systemd.py'"'"'"'"'"'"'"'"' && sleep 0'"'"''
Escalation succeeded
<unixvm001> (1, '\x1b[1;31m==== AUTHENTICATING FOR org.freedesktop.systemd1.manage-units ===\r\n\x1b[0mAuthentication is required to manage system services or units.\r\nAuthenticating as: Unix Admin (rsc_sys)\r\nPassword: \r\n{"msg": "Unable to stop service boot-core: Failed to stop boot-core.service: Connection timed out\\nSee system logs and \'systemctl status boot-core.service\' for details.\\n", "failed": true, "invocation": {"module_args": {"no_block": false, "force": null, "name": "boot-core", "daemon_reexec": false, "enabled": null, "daemon_reload": false, "state": "stopped", "masked": null, "scope": null, "user": null}}}\r\n', 'Shared connection to unixvm001 closed.\r\n')
<unixvm001> Failed to connect to the host via ssh: Shared connection to unixvm001 closed.
<unixvm001> ESTABLISH SSH CONNECTION FOR USER: deployer
<unixvm001> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="deployer"' -o ConnectTimeout=10 -o ControlPath=/root/.ansible/cp/ea5c024329 unixvm001 '/bin/sh -c '"'"'rm -f -r /var/tmp/ansible-tmp-1571756098.47-177915759067620/ > /dev/null 2>&1 && sleep 0'"'"''
<unixvm001> (0, '', '')
fatal: [unixvm001]: FAILED! => {
"ansible_facts": {
"discovered_interpreter_python": "/usr/bin/python"
},
"changed": false,
"invocation": {
"module_args": {
"daemon_reexec": false,
"daemon_reload": false,
"enabled": null,
"force": null,
"masked": null,
"name": "boot-core",
"no_block": false,
"scope": null,
"state": "stopped",
"user": null
}
},
"msg": "Unable to stop service boot-core: Failed to stop boot-core.service: Connection timed out\nSee system logs and 'systemctl status boot-core.service' for details.\n"
}
NO MORE HOSTS LEFT *************************************************************
PLAY RECAP *********************************************************************
unixvm001 : ok=0 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
I remembered I've seen this message: Authentication is required to manage system services or units.\r\nAuthenticating as: Unix Admin (rsc_sys) before when sudo privileges were not applied correctly on the hosts and I was trying to start / stop the service unit by hand, but I'm not sure why it's showing up with Ansible here now.
This is what I get when I do sudo -l:
[deployer@unixvm001 ~]$ sudo -l
Matching Defaults entries for deployer on unixvm001:
ignore_dot, !mail_no_user, !root_sudo, !syslog, timestamp_timeout=10, logfile=/var/log/sudo.log, pwfeedback, passwd_timeout=5, passwd_tries=3, umask_override,
umask=0027, log_host, visiblepw, env_keep+=SSH_AUTH_SOCK, ignore_dot, !mail_no_user, !root_sudo, !syslog, timestamp_timeout=10, logfile=/var/log/sudo.log, pwfeedback,
passwd_timeout=5, passwd_tries=3, umask_override, umask=0027, log_host, visiblepw, env_keep+=SSH_AUTH_SOCK
User deployer may run the following commands on unixvm001:
(root) NOPASSWD: /sbin/multipath -ll, /sbin/ifconfig -a, /usr/bin/ipmitool lan print, /usr/sbin/dmidecode -s system-product-name, /usr/sbin/dmidecode -s
system-serial-number, /usr/bin/last, /usr/sbin/nscd -i hosts, /usr/local/bin/ports, /bin/cat /var/log/dmesg
(oem) NOPASSWD: /usr/oem/agent/agent_inst/bin/emctl, /opt/oracle-oem/bin/emctl, /usr/oem/bin/emctl, /opt/oracle-oem/agent/agent_inst/bin/emctl,
/u01/oracle/agent/agent_inst/bin/emctl
(tomcat) NOPASSWD: ALL, !/bin/su
(root) NOPASSWD: /bin/systemctl * tomcat*, /bin/view /var/log/messages, /bin/systemctl * boot-core*, /bin/systemctl daemon-reload
(tomcat) NOPASSWD: /bin/systemctl * boot-core*
Again, on the hosts I can do: sudo /bin/systemctl stop boot-core.service (same with start) and everything is fine, although if I do only systemctl stop boot-core.service I would get the same error message:
[deployer@unixvm001~]$ systemctl stop boot-core.service
==== AUTHENTICATING FOR org.freedesktop.systemd1.manage-units ===
Authentication is required to manage system services or units.
Authenticating as: Unix Admin (rsc_sys)
Password:
Any clues what's going on here? I believe the sudo privileges should be tweaked, but I'm not entirely sure.
UPDATE:
I modified the Ansible script (just for testing) to use the command module:
- name: "Make sure {{ service_id }} is stopped"
command: "sudo systemctl stop {{ service_id }}"
- name: "Make sure {{ service_id }} is started"
command: "sudo systemctl start {{ service_id }}"
...and it "does work" (though I have to use sudo, it does not work using become: yes and removing sudo from the command):
Oct 24 13:29:35 : deployer : HOST=unixvm001 : TTY=pts/2 ; PWD=/usr/local/home/deployer ; USER=tomcat ; COMMAND=/bin/sh -c echo BECOME-SUCCESS-acdbbxcaetxxlfgnnbvtmrxcofktyjnw ; /usr/bin/python /var/tmp/ansible-tmp-1571938173.5-172296377610468/AnsiballZ_command.py
Oct 24 13:29:36 : tomcat : HOST=unixvm001 : TTY=pts/2 ; PWD=/usr/local/home/tomcat/.ansible/tmp/ansible-moduletmp-1571938175.42-_jtzB0 ; USER=root ; COMMAND=/usr/bin/systemctl stop boot-core.service
Oct 24 13:29:37 : deployer : HOST=unixvm001 : TTY=pts/2 ; PWD=/usr/local/home/deployer ; USER=tomcat ; COMMAND=/bin/sh -c echo BECOME-SUCCESS-utdnsysqmyzkactqhadmmoiujwounyru ; /usr/bin/python /var/tmp/ansible-tmp-1571938176.98-167412210657077/AnsiballZ_command.py
Oct 24 13:29:37 : tomcat : HOST=unixvm001 : TTY=pts/2 ; PWD=/usr/local/home/tomcat/.ansible/tmp/ansible-moduletmp-1571938177.75-k7qyDh ; USER=root ; COMMAND=/usr/bin/systemctl start boot-core.service
---
Oct 24 13:29:37 unixvm001 python: ansible-command Invoked with creates=None executable=None _uses_shell=False strip_empty_ends=True _raw_params=sudo systemctl start boot-core.service removes=None argv=None warn=True chdir=None stdin_add_newline=True stdin=None
Oct 24 13:29:37 unixvm001 python: ansible-command [WARNING] Consider using 'become', 'become_method', and 'become_user' rather than running sudo
Solution 1:[1]
It sounds like you're running a playbook as the tomcat user but then trying to manage a service, this won't work. If you ssh into that machine as the tomcat user and just try to run a systemctl command without escalating privileges then it won't work by hand either. It seems as though you're telling the playbook to do one thing and then doing a completely different thing by hand and calling them equivalent but that one of them isn't working properly. I suspect this is not the case (but I could be wrong and bugs do happen).
You could either break this up as multiple plays, each play set to different users, or with appropriate become options and divide the tasks that way. Alternatively, you can set privilege escalation for those tasks specifically (or at the block level). https://docs.ansible.com/ansible/latest/user_guide/become.html
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Adam Miller |
