Unable to retrieve current config on some os10 switches #130

Closed
cyrilstoll opened this issue Jun 20, 2022 · 4 comments

cyrilstoll commented Jun 20, 2022

Hi, we have several os10 switches as well as some os6 and some os9 ones. With my role and playbook I am able to download the config of all os6 and os9 switches but only some os10 switches. All of the os10 ones are the S4148F-ON model.

Works with:

OS Version: 10.5.2.0
Build Version: 10.5.2.0.232
Build Time: 2020-09-24T18:40:26+0000
System Type: S4148F-ON

Does not work with:

OS Version: 10.5.2.3
Build Version: 10.5.2.3.316
Build Time: 2021-02-26T20:03:25+0000
System Type: S4148F-ON

Also does not work with:

OS Version: 10.5.0.4
Build Version: 10.5.0.4.638
Build Time: 2020-01-30T21:08:56+0000
System Type: S4148F-ON

We use the following Ansible and Python versions:

ansible 2.9.27
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/DATA/ansible/library']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Nov 16 2020, 22:23:17) [GCC 4.8.5 20150623 (Red Hat 4.8.5-44)]

We have a defaults file in the role with the following vars. I also tried setting the ansible_become var to "no", without success, since it is not necessary to type enable when showing the running-config directly on the switch after logging in with SSH.

ansible_connection: network_cli
backup_directory: /home/username/config_backups
ansible_user: switchuser
ansible_become: yes
ansible_become_method: enable

The tasks file in the role looks basically the same for all types of switches, each using the relevant modules, so I am only showing the task for the os10 ones here. I have other tasks that create the necessary folders to store the config; they work fine and are omitted here.

- name: backup config of all os10 switches
  dellos10_config:
    backup: true
    save: false
    backup_options:
      dir_path: "{{ backup_directory }}/{{ inventory_hostname }}"
      filename: "{{ inventory_hostname }}_{{ lookup('pipe', 'date +%F') }}.cfg"
  when: ansible_network_os == 'dellos10'

In the inventory file all the switches have the ansible_host and dell_os_version vars like so:

switch-01 ansible_host=192.168.1.91 dell_os_version=dellos10
switch-02 ansible_host=192.168.1.92 dell_os_version=dellos10

The playbook is rather simple:

- hosts: dellswitches
  gather_facts: no
  roles:
    - backup_dell_switches

The playbook is executed with this command:

ansible-playbook -i inventory backup_dellswitches.yml -k

I already tried increasing ansible_timeout to 90 seconds and setting ansible_python_interpreter to python3.6, which is also installed.
I also tried setting some environment vars that I found while researching this issue:

environment:
  ANSIBLE_TIMEOUT: "90"
  PYTHONPATH: "/usr/lib/opx:/usr/lib/x86_64-linux-gnu/opx"
  LD_LIBRARY_PATH: "/usr/lib/opx:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib:/lib"
  ANSIBLE_NETWORK_GROUP_MODULES: "os10"

Finally the error message I get when running with -vvv and limiting to one switch is:

TASK [backup_dell_switches : backup config of all os10 switches] *********************************************************************************************************************************************
task path: /home/username/ansible_dell/roles/backup_dell_switches/tasks/main.yml:41
<192.168.1.91> ESTABLISH LOCAL CONNECTION FOR USER: username
<192.168.1.91> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /home/username/.ansible/tmp/ansible-local-146376QaAVO `"&& mkdir "` echo /home/username/.ansible/tmp/ansible-local-146376QaAVO/ansible-tmp-1655716995.95-14690-90008897111806 `" && echo ansible-tmp-1655716995.95-14690-90008897111806="` echo /home/username/.ansible/tmp/ansible-local-146376QaAVO/ansible-tmp-1655716995.95-14690-90008897111806 `" ) && sleep 0'
<switch-01> Attempting python interpreter discovery
<192.168.1.91> EXEC /bin/sh -c 'echo PLATFORM; uname; echo FOUND; command -v '"'"'/usr/bin/python'"'"'; command -v '"'"'python3.7'"'"'; command -v '"'"'python3.6'"'"'; command -v '"'"'python3.5'"'"'; command -v '"'"'python2.7'"'"'; command -v '"'"'python2.6'"'"'; command -v '"'"'/usr/libexec/platform-python'"'"'; command -v '"'"'/usr/bin/python3'"'"'; command -v '"'"'python'"'"'; echo ENDFOUND && sleep 0'
<192.168.1.91> EXEC /bin/sh -c '/usr/bin/python && sleep 0'
Using module file /usr/lib/python2.7/site-packages/ansible/modules/network/dellos10/dellos10_config.py
<192.168.1.91> PUT /home/username/.ansible/tmp/ansible-local-146376QaAVO/tmpEllrKW TO /home/username/.ansible/tmp/ansible-local-146376QaAVO/ansible-tmp-1655716995.95-14690-90008897111806/AnsiballZ_dellos10_config.py
<192.168.1.91> EXEC /bin/sh -c 'chmod u+x /home/username/.ansible/tmp/ansible-local-146376QaAVO/ansible-tmp-1655716995.95-14690-90008897111806/ /home/username/.ansible/tmp/ansible-local-146376QaAVO/ansible-tmp-1655716995.95-14690-90008897111806/AnsiballZ_dellos10_config.py && sleep 0'
<192.168.1.91> EXEC /bin/sh -c '/usr/bin/python /home/username/.ansible/tmp/ansible-local-146376QaAVO/ansible-tmp-1655716995.95-14690-90008897111806/AnsiballZ_dellos10_config.py && sleep 0'
<192.168.1.91> EXEC /bin/sh -c 'rm -f -r /home/username/.ansible/tmp/ansible-local-146376QaAVO/ansible-tmp-1655716995.95-14690-90008897111806/ > /dev/null 2>&1 && sleep 0'
The full traceback is:
WARNING: The below traceback may *not* be related to the actual failure.
  File "/tmp/ansible_dellos10_config_payload_Y4PM3u/ansible_dellos10_config_payload.zip/ansible/module_utils/network/dellos10/dellos10.py", line 86, in get_config
    return _DEVICE_CONFIGS[cmd]
fatal: [switch-01]: FAILED! => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python"
    },
    "changed": false,
    "invocation": {
        "module_args": {
            "after": null,
            "auth_pass": null,
            "authorize": null,
            "backup": true,
            "backup_options": {
                "dir_path": "/home/username/config_backups/switch-01",
                "filename": "switch-01_2022-06-20.cfg"
            },
            "before": null,
            "config": null,
            "host": null,
            "lines": null,
            "match": "line",
            "parents": null,
            "password": null,
            "port": null,
            "provider": null,
            "replace": "line",
            "save": false,
            "src": null,
            "ssh_keyfile": null,
            "timeout": null,
            "update": "merge",
            "username": null
        }
    },
    "msg": "unable to retrieve current config",
    "stderr": "Internal error",
    "stderr_lines": [
        "Internal error"
    ]
}

PLAY RECAP ***************************************************************************************************************************************************************************************************
switch-01                : ok=2    changed=0    unreachable=0    failed=1    skipped=2    rescued=0    ignored=0

Unfortunately I don't understand what the issue is, especially since it only works on the switches running the middle version: it works on OS 10.5.2.0 but neither on the older 10.5.0.4 nor on the newer 10.5.2.3. I therefore doubt that the OS version is the culprit, as Dell would have had to fix the issue and then break it again only three minor (or whatever the fourth number is called) updates later.
Then again, my Ansible setup should not be at fault either, as it works fine with some os10 switches as well as all our os6 and os9 switches.
I would be really glad if somebody had suggestions on what else I could try. Of course I can supply more information in case I missed anything.
Thanks in advance for having a look at this issue and best regards, cyrilstoll

zerwes commented Dec 30, 2022

As I walked into the #114 trap again today ... (ff of #113)
can you try adding

collections:
   - dellemc.os10

on top of the task?
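
A minimal sketch of what this suggestion might look like when applied to the task from the original report. It is untested here and rests on several assumptions: that the dellemc.os10 collection is installed on the control node (for example with ansible-galaxy collection install dellemc.os10), that the collection's os10_config module is used in place of the bundled dellos10_config, and that ansible_network_os is set to dellemc.os10.os10 for these hosts so that network_cli picks up the collection's plugins. The backup_directory variable and the backup options are taken unchanged from the task above.

```yaml
# Sketch only: assumes the dellemc.os10 collection is installed and that
# ansible_network_os is set to dellemc.os10.os10 for the os10 hosts.
- name: backup config of all os10 switches
  collections:
    - dellemc.os10            # search this collection for unqualified module names
  os10_config:                # assumed replacement for the bundled dellos10_config
    backup: true
    save: false
    backup_options:
      dir_path: "{{ backup_directory }}/{{ inventory_hostname }}"
      filename: "{{ inventory_hostname }}_{{ lookup('pipe', 'date +%F') }}.cfg"
  # Assumption: the condition has to match the collection's network OS name.
  when: ansible_network_os == 'dellemc.os10.os10'
```

Writing the module name fully qualified as dellemc.os10.os10_config should have the same effect without the collections keyword, and may be the more predictable choice inside a role.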

cyrilstoll (author) commented

Thanks @zerwes for the hint. I kinda forgot about this issue as I don't work for the same company anymore. I will try to let my former co-workers know about your suggested solution. However, they might have switched to another product for backing up the configs by now, so I am not sure whether they will test it. I therefore don't really know how to proceed with this bug report. In my opinion it could be closed, but I am not sure whether the underlying issue is actually resolved by the suggestion, as I can no longer test it myself lacking access to dell switches.

zerwes commented Jan 2, 2023

"as I [am] lacking access to dell switches."

Lucky you :-) I appreciate neither the switches nor the collection ... but I have to deal with them.
If you cannot pass the issue on to one of your former colleagues, you can close it IMHO, as it makes no sense without reproducible tests ...

Happy new year

cyrilstoll (author) commented

Happy new year to you too, thanks!

I just heard back from my former co-workers, and they have indeed switched to another solution for fetching config backups. I am therefore closing this issue. Sorry for not doing so earlier.
