Missing pre-flight test/setup to configure NetworkManager to use dns=dnsmasq #119

Closed
skabashnyuk opened this issue May 8, 2019 · 15 comments · Fixed by #124

Comments

@skabashnyuk
Contributor

Got this error on Fedora 30.

[ksm@DESKTOP-6AM8T06 crc-0.85.0-linux-amd64]$ ./crc start  -b ../crc_libvirt_v4.1.0.rc0.tar.xz 
crc - Local OpenShift 4.x cluster
INFO  Checking if Virtualization is enabled       
INFO  Checking if KVM is enabled                  
INFO  Checking if Libvirt is installed            
INFO  Checking if user is part of libvirt group   
INFO  Checking if Libvirt is enabled              
INFO  Checking if Libvirt daemon is running       
INFO  Checking if crc-driver-libvirt is installed 
INFO  Checking if Libvirt crc network is available 
INFO  Checking if Libvirt crc network is active   
INFO  Checking if /etc/NetworkManager/dnsmasq.d/crc.conf exists 
INFO  Extracting the Bundle tarball ...           
INFO  Creating VM ...                             
INFO  Waiting 3m0s for the openshift cluster to be started ... 
INFO  To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/home/ksm/.crc/cache/crc_libvirt_v4.1.0.rc0/kubeconfig' 
INFO  Access the OpenShift web-console here: https://console-openshift-console.apps.tt.testing 
INFO  Login to the console with user: kubeadmin, password: xxxxxxxxxx
INFO Running                                      
[ksm@DESKTOP-6AM8T06 crc-0.85.0-linux-amd64]$ ./crc config^C
[ksm@DESKTOP-6AM8T06 crc-0.85.0-linux-amd64]$ ^C
[ksm@DESKTOP-6AM8T06 crc-0.85.0-linux-amd64]$ ^C
[ksm@DESKTOP-6AM8T06 crc-0.85.0-linux-amd64]$ oc get co
Unable to connect to the server: dial tcp: lookup api.crc.tt.testing on 192.168.1.1:53: no such host
[ksm@DESKTOP-6AM8T06 crc-0.85.0-linux-amd64]$ dig api.test1.tt.testing

; <<>> DiG 9.11.5-P4-RedHat-9.11.5-13.P4.fc30 <<>> api.test1.tt.testing
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 31395
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;api.test1.tt.testing.		IN	A

;; AUTHORITY SECTION:
.			1800	IN	SOA	a.root-servers.net. nstld.verisign-grs.com. 2019050801 1800 900 604800 86400

;; Query time: 31 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)
;; WHEN: Wed May 08 23:38:16 EEST 2019
;; MSG SIZE  rcvd: 124

[ksm@DESKTOP-6AM8T06 crc-0.85.0-linux-amd64]$ sudo virsh net-list --all
 Name      State    Autostart   Persistent
--------------------------------------------
 crc       active   yes         yes
 default   active   yes         yes

Any suggestions on what to do?

@jmazzitelli

I was just going to create this same issue :) I just ran into this on Fedora 29:

The output from crc start:

INFO  Creating VM ...                             
INFO  Waiting 3m0s for the openshift cluster to be started ... 
INFO  To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/home/mazz/.crc/cache/crc_libvirt_v4.1.0.rc0/kubeconfig' 
INFO  Access the OpenShift web-console here: https://console-openshift-console.apps.tt.testing 

What I try to do after it starts:

$ export KUBECONFIG=/home/mazz/.crc/cache/crc_libvirt_v4.1.0.rc0/kubeconfig
$ oc login
error: dial tcp: lookup api.crc.tt.testing: No address associated with hostname - verify you have provided the correct host and port and that the server is currently running.
$ ping console-openshift-console.apps.tt.testing
ping: console-openshift-console.apps.tt.testing: No address associated with hostname

@anjannath
Member

anjannath commented May 9, 2019

What is the content of /etc/NetworkManager/dnsmasq.d/crc.conf? Also, try restarting the NetworkManager service.

Note: If you ran our PoC earlier, there will be a file /etc/NetworkManager/dnsmasq.d/openshift.conf. Remove it and restart NetworkManager.
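
The two checks above could be scripted roughly as follows. This is only a sketch: `check_dnsmasq_dir` is a made-up helper name, not part of crc, and its optional directory argument exists so it can be tried against a scratch directory. It only reports what it finds; removing the stale file and restarting NetworkManager is left to you.

```shell
#!/bin/sh
# Sketch: report the state of the dnsmasq drop-in directory read by NetworkManager.
# check_dnsmasq_dir is a hypothetical helper, not part of crc.
check_dnsmasq_dir() {
    dir="${1:-/etc/NetworkManager/dnsmasq.d}"

    # Show the crc drop-in if it exists.
    if [ -f "$dir/crc.conf" ]; then
        echo "crc.conf found:"
        cat "$dir/crc.conf"
    else
        echo "crc.conf missing"
    fi

    # A leftover file from the earlier PoC should be removed by hand.
    if [ -f "$dir/openshift.conf" ]; then
        echo "stale openshift.conf found"
        return 1
    fi
    return 0
}
```

A possible use: `check_dnsmasq_dir || echo "remove openshift.conf, then run: sudo systemctl restart NetworkManager"`.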

@praveenkumar
Member

@jmazzitelli @skabashnyuk did you run the crc setup command before running crc start?

According to @skabashnyuk's logs, 192.168.1.1#53 is the DNS server, which shouldn't be the case. Are you using a custom DNS server somewhere?

@jmazzitelli Can you also provide the output of the dig command?

@gbraad
Contributor

gbraad commented May 9, 2019

It might also be the problem described in #25. More info is needed.

@jmazzitelli

  1. The content of my /etc/NetworkManager/dnsmasq.d/crc.conf:
server=/tt.testing/192.168.130.1
address=/apps.tt.testing/192.168.130.11
  2. There is no openshift.conf:
$ ls /etc/NetworkManager/dnsmasq.d/openshift.conf
ls: cannot access '/etc/NetworkManager/dnsmasq.d/openshift.conf': No such file or directory
  3. Yes, I ran crc setup prior to running crc start.

  4. dig and virsh net-list output:

$ dig api.crc.tt.testing

; <<>> DiG 9.11.5-P4-RedHat-9.11.5-4.P4.fc29 <<>> api.crc.tt.testing
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 38898
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;api.crc.tt.testing.		IN	A

;; AUTHORITY SECTION:
.			10345	IN	SOA	a.root-servers.net. nstld.verisign-grs.com. 2019050900 1800 900 604800 86400

;; Query time: 63 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)
;; WHEN: Thu May 09 04:18:36 EDT 2019
;; MSG SIZE  rcvd: 122
$ sudo virsh net-list --all
[sudo] password for mazz: 
 Name                 State      Autostart     Persistent
----------------------------------------------------------
 crc                  active     yes           yes
 default              active     yes           yes
  5. 192.168.1.1#53 also shows up as my DNS server in the dig output. That is my home router, and it is simply the DNS server for my local network. I don't do anything custom here; it's how the machines on my network resolve DNS. Let me know if there is something I should change here. I'm no DNS server expert :)

@jmazzitelli

  1. I shut down the cluster (crc stop), restarted NetworkManager (sudo systemctl restart NetworkManager), and started the cluster again (crc setup; crc start), but I get the same issue.

@anjannath
Member

@jmazzitelli Thanks for the detailed info. I think in your case dnsmasq is not being started by NetworkManager, and DHCP is setting your DNS server to 192.168.1.1.

Can you paste the output of cat /etc/resolv.conf and systemctl status NetworkManager?
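
One quick way to tell the two cases apart is to look at which nameserver /etc/resolv.conf lists: when NetworkManager runs its dnsmasq plugin the file points at 127.0.0.1, otherwise at whatever DHCP handed out. A minimal sketch, where `uses_local_dnsmasq` is a hypothetical helper name and the file argument exists only so the check can be tried on a copy:

```shell
#!/bin/sh
# Sketch: succeed if the first nameserver in resolv.conf is the local dnsmasq.
# uses_local_dnsmasq is a hypothetical helper, not part of crc or NetworkManager.
uses_local_dnsmasq() {
    file="${1:-/etc/resolv.conf}"
    # Take the address from the first "nameserver" line.
    first=$(awk '$1 == "nameserver" { print $2; exit }' "$file")
    [ "$first" = "127.0.0.1" ]
}
```

For example: `uses_local_dnsmasq || echo "resolv.conf does not point at the local dnsmasq"`.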

@skabashnyuk
Contributor Author

@praveenkumar

do you folks run the crc setup command before running it?

yes

192.168.1.1#53 is the dns server which shouldn't be the case

192.168.1.1 is the IP of my router.

[ksm@DESKTOP-6AM8T06 ~]$ cat /etc/NetworkManager/dnsmasq.d/crc.conf
server=/tt.testing/192.168.130.1
address=/apps.tt.testing/192.168.130.11
[ksm@DESKTOP-6AM8T06 ~]$ cat /etc/resolv.conf
# Generated by NetworkManager
nameserver 192.168.1.1
[ksm@DESKTOP-6AM8T06 ~]$ sudo systemctl status NetworkManager
[sudo] password for ksm: 
● NetworkManager.service - Network Manager
   Loaded: loaded (/usr/lib/systemd/system/NetworkManager.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2019-05-09 14:07:28 EEST; 7min ago
     Docs: man:NetworkManager(8)
 Main PID: 1071 (NetworkManager)
    Tasks: 4 (limit: 4915)
   Memory: 16.8M
   CGroup: /system.slice/NetworkManager.service
           ├─1071 /usr/sbin/NetworkManager --no-daemon
           └─1394 /sbin/dhclient -d -q -sf /usr/libexec/nm-dhcp-helper -pf /var/run/dhclient-enp3s0.pid -lf /var/lib/NetworkManager/dhclient-cb60bd95-6f62-3dbf-983c-752896ce144f-enp3s0.lease -cf /var/lib/NetworkManager/dhclient-enp3s0.co>

May 09 14:07:30 localhost.localdomain NetworkManager[1071]: <warn>  [1557400050.6773] dns-sd-resolved[0x55be922a0820]: Failed: GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: Could not activate remote peer.
May 09 14:07:30 localhost.localdomain NetworkManager[1071]: <warn>  [1557400050.6773] dns-sd-resolved[0x55be922a0820]: Failed: GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: Could not activate remote peer.
May 09 14:07:30 localhost.localdomain NetworkManager[1071]: <warn>  [1557400050.6773] dns-sd-resolved[0x55be922a0820]: Failed: GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: Could not activate remote peer.
May 09 14:07:30 localhost.localdomain NetworkManager[1071]: <warn>  [1557400050.6773] dns-sd-resolved[0x55be922a0820]: Failed: GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: Could not activate remote peer.
May 09 14:07:30 localhost.localdomain NetworkManager[1071]: <warn>  [1557400050.6773] dns-sd-resolved[0x55be922a0820]: Failed: GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: Could not activate remote peer.
May 09 14:07:30 localhost.localdomain NetworkManager[1071]: <warn>  [1557400050.6774] dns-sd-resolved[0x55be922a0820]: Failed: GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: Could not activate remote peer.
May 09 14:07:30 localhost.localdomain NetworkManager[1071]: <warn>  [1557400050.6774] dns-sd-resolved[0x55be922a0820]: Failed: GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: Could not activate remote peer.
May 09 14:07:30 localhost.localdomain NetworkManager[1071]: <info>  [1557400050.6774] policy: set-hostname: set hostname to 'DESKTOP-6AM8T06' (from address lookup)
May 09 14:07:30 DESKTOP-6AM8T06 NetworkManager[1071]: <info>  [1557400050.9614] manager: NetworkManager state is now CONNECTED_GLOBAL
May 09 14:08:51 DESKTOP-6AM8T06 NetworkManager[1071]: <info>  [1557400131.0800] agent-manager: req[0x55be9233b330, :1.205/org.gnome.Shell.NetworkAgent/1000]: agent registered

@anjannath
Member

Can you create the following file, then restart NetworkManager?

$ cat /etc/NetworkManager/conf.d/crc-libvirt-dnsmasq.conf 
[main]
dns=dnsmasq

We need to add preflight and setup code for this too.
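
As a rough idea of what such a check and setup step might do, here is a shell sketch. It is not crc's actual code: the function names and the optional directory argument are invented for illustration, and the check only scans drop-in files in conf.d for a `dns=dnsmasq` line without verifying that it sits under `[main]` or looking at the main NetworkManager.conf.

```shell
#!/bin/sh
# Sketch of a pre-flight check and a setup step for dns=dnsmasq.
# Function names and the directory argument are hypothetical, not crc's real code.

CONF_NAME="crc-libvirt-dnsmasq.conf"

# Pre-flight: succeed if some *.conf drop-in in the directory sets dns=dnsmasq.
nm_dnsmasq_enabled() {
    dir="${1:-/etc/NetworkManager/conf.d}"
    grep -qsE '^[[:space:]]*dns[[:space:]]*=[[:space:]]*dnsmasq[[:space:]]*$' "$dir"/*.conf
}

# Setup: write the drop-in only when the pre-flight check fails.
nm_enable_dnsmasq() {
    dir="${1:-/etc/NetworkManager/conf.d}"
    if nm_dnsmasq_enabled "$dir"; then
        return 0
    fi
    printf '[main]\ndns=dnsmasq\n' > "$dir/$CONF_NAME"
}
```

Writing to the real /etc/NetworkManager/conf.d needs root, and NetworkManager still has to be restarted or reloaded afterwards for the setting to take effect.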

@skabashnyuk
Contributor Author

works for me

[ksm@DESKTOP-6AM8T06 crc-0.85.0-linux-amd64]$ sudo touch /etc/NetworkManager/conf.d/crc-libvirt-dnsmasq.conf 
[sudo] password for ksm: 
[ksm@DESKTOP-6AM8T06 crc-0.85.0-linux-amd64]$ sudo mcedit /etc/NetworkManager/conf.d/crc-libvirt-dnsmasq.conf 

[ksm@DESKTOP-6AM8T06 crc-0.85.0-linux-amd64]$ sudo systemctl restart  NetworkManager
[ksm@DESKTOP-6AM8T06 crc-0.85.0-linux-amd64]$ oc get co
NAME                                 VERSION      AVAILABLE   PROGRESSING   FAILING   SINCE
authentication                       4.1.0-rc.0   True        False         False     12d
cloud-credential                     4.1.0-rc.0   True        False         False     12d
cluster-autoscaler                   4.1.0-rc.0   True        False         False     12d
console                              4.1.0-rc.0   False       True          False     103m
dns                                  4.1.0-rc.0   True        False         False     104m
image-registry                       4.1.0-rc.0   True        False         False     12d
ingress                              4.1.0-rc.0   True        False         False     104m
kube-apiserver                       4.1.0-rc.0   True        False                   12d
kube-controller-manager              4.1.0-rc.0   True        False                   12d
kube-scheduler                       4.1.0-rc.0   True        False                   12d
machine-api                          4.1.0-rc.0   True        False         False     12d
machine-config                       4.1.0-rc.0   False       False         True      12d
marketplace                          4.1.0-rc.0   False       False         True      12d
monitoring                                        Unknown     True          Unknown   12d
network                              4.1.0-rc.0   True        False                   12d
node-tuning                          4.1.0-rc.0   True        False         False     12d
openshift-apiserver                  4.1.0-rc.0   True        False                   12d
openshift-controller-manager         4.1.0-rc.0   True        False                   11d
openshift-samples                    4.1.0-rc.0   True        False         False     12d
operator-lifecycle-manager           4.1.0-rc.0   True        False         False     12d
operator-lifecycle-manager-catalog   4.1.0-rc.0   True        False         False     12d
service-ca                           4.1.0-rc.0   True        False         False     12d
service-catalog-apiserver            4.1.0-rc.0   True        False         False     103m
service-catalog-controller-manager   4.1.0-rc.0   True        False         False     103m
storage                              4.1.0-rc.0   True        False         False     12d

@jmazzitelli

jmazzitelli commented May 9, 2019

@anjannath - here's the output

(my first attempt at this comment was wrong - didn't notice I was booted from my session - had to ssh back into my machine)

$ cat /etc/resolv.conf 
# Generated by NetworkManager
nameserver 127.0.0.1

$ systemctl status NetworkManager
● NetworkManager.service - Network Manager
   Loaded: loaded (/usr/lib/systemd/system/NetworkManager.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2019-05-09 10:15:41 EDT; 40s ago
     Docs: man:NetworkManager(8)
 Main PID: 2048 (NetworkManager)
    Tasks: 5 (limit: 4915)
   Memory: 21.0M
   CGroup: /system.slice/NetworkManager.service
           ├─2048 /usr/sbin/NetworkManager --no-daemon
           ├─2099 /sbin/dhclient -d -q -sf /usr/libexec/nm-dhcp-helper -pf /var/run/dhclient-enp6s0.pid -lf /var/li>
           └─2121 /usr/sbin/dnsmasq --no-resolv --keep-in-foreground --no-hosts --bind-interfaces --pid-file=/var/r>

May 09 10:15:42 mazztower dnsmasq[2121]: chown of PID file /var/run/NetworkManager/dnsmasq.pid failed: Operation no>
May 09 10:15:42 mazztower dnsmasq[2121]: DBus support enabled: connected to system bus
May 09 10:15:42 mazztower dnsmasq[2121]: using nameserver 192.168.130.1#53 for domain tt.testing
May 09 10:15:42 mazztower dnsmasq[2121]: cleared cache
May 09 10:15:42 mazztower NetworkManager[2048]: <info>  [1557411342.1476] dnsmasq[0x55847b4d0c60]: dnsmasq appeared>
May 09 10:15:42 mazztower dnsmasq[2121]: setting upstream servers from DBus
May 09 10:15:42 mazztower dnsmasq[2121]: using nameserver 192.168.130.1#53 for domain tt.testing
May 09 10:15:42 mazztower dnsmasq[2121]: using nameserver 192.168.1.1#53(via enp6s0)
May 09 10:15:42 mazztower dnsmasq[2121]: using nameserver 192.168.1.1#53 for domain 1.168.192.in-addr.arpa
May 09 10:15:42 mazztower dnsmasq[2121]: cleared cache

@jmazzitelli

jmazzitelli commented May 9, 2019

I was asked for this in the Slack channel:

$ dig api.crc.tt.testing

; <<>> DiG 9.11.5-P4-RedHat-9.11.5-4.P4.fc29 <<>> api.crc.tt.testing
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 35402
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
;; QUESTION SECTION:
;api.crc.tt.testing.		IN	A

;; AUTHORITY SECTION:
.			10800	IN	SOA	a.root-servers.net. nstld.verisign-grs.com. 2019050900 1800 900 604800 86400

;; Query time: 1306 msec
;; SERVER: 192.168.1.1#53(192.168.1.1)
;; WHEN: Thu May 09 10:14:15 EDT 2019
;; MSG SIZE  rcvd: 122

@gbraad added the kind/bug, os/linux, and priority/major labels May 9, 2019
@gbraad
Contributor

gbraad commented May 9, 2019

Missing test for setting NetworkManager to use dnsmasq?

@gbraad changed the title from "Unable to connect to the server: dial tcp: lookup api.crc.tt.testing on 192.168.1.1:53: no such host" to "Missing pre-flight test/setup to configure NetworkManager to use dns=dnsmasq" May 9, 2019
@gbraad added this to To do in Sprint 166 via automation May 9, 2019
@jmazzitelli

OK, this fixed it:

Can you create the following file, then restart NetworkManager?

$ cat /etc/NetworkManager/conf.d/crc-libvirt-dnsmasq.conf 
[main]
dns=dnsmasq

Thanks!

@fabiand

fabiand commented May 10, 2019

Works for me:

…
INFO  To access the cluster as the system:admin user when using 'oc', run 'export KUBECONFIG=/home/fabiand/.crc/cache/crc_libvirt_v4.1.0.rc0/kubeconfig' 
INFO  Access the OpenShift web-console here: https://console-openshift-console.apps.tt.testing 
INFO  Login to the console with user: kubeadmin, password: DBgjT-T45U2-4yn3o-RiycE 
INFO Running                                      
[fabiand@seven ~]$ export KUBECONFIG=/home/fabiand/.crc/cache/crc_libvirt_v4.1.0.rc0/kubeconfig
[fabiand@seven ~]$ oc get pods
Unable to connect to the server: dial tcp: lookup api.crc.tt.testing on 10.38.5.26:53: no such host

[fabiand@seven ~]$ sudo vi /etc/NetworkManager/conf.d/crc-libvirt-dnsmasq.conf 
[fabiand@seven ~]$ oc get pods
Unable to connect to the server: dial tcp: lookup api.crc.tt.testing on 10.38.5.26:53: no such host

[fabiand@seven ~]$ sudo systemctl reload NetworkManager

[fabiand@seven ~]$ oc get co
NAME                                 AGE
authentication                       13d
cloud-credential                     13d
cluster-autoscaler                   13d
console                              13d
…

Sprint 166 automation moved this from To do to Done May 13, 2019