Ubuntu systemd-resolve not using correct DNS server for certain domain
I've been setting up a Nomad cluster together with Consul and ran into a problem with the Consul DNS service running on 127.0.0.1:8600.
I'm using systemd-resolved and iptables to forward Consul requests, following the official documentation, so my resolved.conf files look like this:
[Resolve]
DNS=127.0.0.1
Domains=~consul
[Resolve]
DNS=111.152.1.1 111.152.1.5
#FallbackDNS=
Domains=example.com
#LLMNR=no
#MulticastDNS=no
#DNSSEC=no
#DNSOverTLS=no
#Cache=no-negative
#DNSStubListener=yes
#ReadEtcHosts=yes
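For context, the iptables forwarding mentioned above can be sketched roughly as follows. This is an assumption modelled on the iptables approach in the Consul DNS forwarding guide, not the exact rules from this setup. The function name print_consul_dns_redirect_rules is hypothetical, and it only prints the commands so they can be reviewed before being run as root.

```shell
#!/bin/sh
# Print (not execute) NAT rules that redirect local DNS traffic on port 53
# to the Consul DNS port. Port 8600 comes from the question; the rule layout
# is an assumption modelled on the Consul forwarding guide.
print_consul_dns_redirect_rules() {
    consul_port="${1:-8600}"
    for proto in udp tcp; do
        echo "iptables --table nat --append OUTPUT --destination localhost" \
             "--protocol $proto --match $proto --dport 53" \
             "--jump REDIRECT --to-ports $consul_port"
    done
}

print_consul_dns_redirect_rules 8600
```

With rules like these in place, DNS=127.0.0.1 in resolved.conf reaches Consul on port 8600 even though resolved itself sends the query to port 53.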
After restarting resolved and networkd, the active DNS server is set to localhost, and queries such as host active.vault.service.dc1.consul are sent correctly to the Consul DNS service:
Looking up RR for active.vault.service.dc1.consul IN A.
Switching to DNS server 111.152.1.5 for interface ens192.
Switching to system DNS server 127.0.0.1.
Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/resolve1 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=5 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
Cache miss for active.vault.service.dc1.consul IN A
Transaction 47973 for <active.vault.service.dc1.consul IN A> scope dns on */*.
Using feature level UDP+EDNS0 for transaction 47973.
Using DNS server 127.0.0.1 for transaction 47973.
Sending query packet with id 47973.
Processing query...
Processing incoming packet on transaction 47973 (rcode=SUCCESS).
Verified we get a response at feature level UDP+EDNS0 from DNS server 127.0.0.1.
Transaction 47973 for <active.vault.service.dc1.consul IN A> on scope dns on */* now complete with <success> from network (unsigned).
Sending response packet with id 7470 on interface 1/AF_INET.
Freeing transaction 47973.
Running host example.com switches the current DNS server to 111.152.1.1 and works correctly as well:
Looking up RR for example.com IN A.
Cache miss for example.com IN A
Transaction 4122 for <example.com IN A> scope dns on */*.
Using feature level UDP+EDNS0 for transaction 4122.
Using DNS server 127.0.0.1 for transaction 4122.
Sending query packet with id 4122.
Cache miss for example.com IN A
Transaction 7037 for <example.com IN A> scope dns on ens192/*.
Using feature level UDP+EDNS0 for transaction 7037.
Using DNS server 111.152.1.5 for transaction 7037.
Sending query packet with id 7037.
Processing query...
Processing incoming packet on transaction 4122 (rcode=REFUSED).
Server returned REFUSED, switching servers, and retrying.
Retrying transaction 4122.
Switching to system DNS server 111.152.1.1.
Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/resolve1 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=6 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
Cache miss for example.com IN A
Transaction 4122 for <example.com IN A> scope dns on */*.
Using feature level UDP+EDNS0 for transaction 4122.
Using DNS server 111.152.1.1 for transaction 4122.
Sending query packet with id 4122.
Processing incoming packet on transaction 7037 (rcode=SUCCESS).
Verified we get a response at feature level UDP+EDNS0 from DNS server 111.152.1.5.
Added positive unauthenticated cache entry for example.com IN A 1688s on ens192/INET/111.152.1.5
Transaction 7037 for <example.com IN A> on scope dns on ens192/* now complete with <success> from network (unsigned).
Freeing transaction 4122.
However, when trying host active.vault.service.dc1.consul again, it is not resolved correctly:
Looking up RR for active.vault.service.dc1.consul IN A.
Cache miss for active.vault.service.dc1.consul IN A
Transaction 21878 for <active.vault.service.dc1.consul IN A> scope dns on */*.
Using feature level UDP+EDNS0 for transaction 21878.
Using DNS server 111.152.1.1 for transaction 21878.
Sending query packet with id 21878.
Processing query...
Processing incoming packet on transaction 21878 (rcode=NXDOMAIN).
Server returned error NXDOMAIN in EDNS0 mode, retrying transaction with reduced feature level UDP (DVE-2018-0001 mitigation)
Retrying transaction 21878.
Cache miss for active.vault.service.dc1.consul IN A
Transaction 21878 for <active.vault.service.dc1.consul IN A> scope dns on */*.
Using feature level UDP for transaction 21878.
Sending query packet with id 21878.
The output of systemd-resolve --status looks as follows:
Global
LLMNR setting: no
MulticastDNS setting: no
DNSOverTLS setting: no
DNSSEC setting: no
DNSSEC supported: no
Current DNS Server: 111.152.1.1
DNS Servers: 127.0.0.1
111.152.1.1
111.152.1.5
DNS Domain: ~consul
example.com
DNSSEC NTA: 10.in-addr.arpa
16.172.in-addr.arpa
168.192.in-addr.arpa
17.172.in-addr.arpa
18.172.in-addr.arpa
19.172.in-addr.arpa
20.172.in-addr.arpa
21.172.in-addr.arpa
22.172.in-addr.arpa
23.172.in-addr.arpa
24.172.in-addr.arpa
25.172.in-addr.arpa
26.172.in-addr.arpa
27.172.in-addr.arpa
28.172.in-addr.arpa
29.172.in-addr.arpa
30.172.in-addr.arpa
31.172.in-addr.arpa
corp
d.f.ip6.arpa
home
internal
intranet
lan
local
private
test
Link 3 (docker0)
Current Scopes: none
DefaultRoute setting: no
LLMNR setting: yes
MulticastDNS setting: no
DNSOverTLS setting: no
DNSSEC setting: no
DNSSEC supported: no
Link 2 (ens192)
Current Scopes: DNS
DefaultRoute setting: yes
LLMNR setting: yes
MulticastDNS setting: no
DNSOverTLS setting: no
DNSSEC setting: no
DNSSEC supported: no
Current DNS Server: 111.152.1.5
DNS Servers: 111.152.1.5
111.152.1.1
DNS Domain: example.com
So my question is why the .consul domain requests are not always sent to the local DNS server.
Solution 1:[1]
I'm not sure which systemd version you are running, but there was a bug in systemd's DNS resolver/cache (systemd-resolved) that resulted in errors like:
Transaction 7037 for <example.com IN A> on scope dns on ens192/* now complete with <success> from network (unsigned).
The bug was fixed in a commit on Feb 14, 2021, which was released in systemd v248.
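To check whether a host already runs a release with that fix, the major version can be read from systemctl --version. This is a minimal sketch, assuming the usual "systemd <major> (<package version>)" first line of output; the helper names systemd_major and has_resolved_fix are hypothetical.

```shell
#!/bin/sh
# Extract the major version from the first line of `systemctl --version`
# output, e.g. "systemd 245 (245.4-4ubuntu3.20)" -> 245.
systemd_major() {
    printf '%s\n' "$1" | awk 'NR == 1 && $1 == "systemd" { print int($2) }'
}

# Succeeds (exit status 0) when the given major version is at least 248,
# the release named above as containing the fix.
has_resolved_fix() {
    [ "$1" -ge 248 ]
}

# Report on the local machine; skipped when systemctl is not installed.
if command -v systemctl >/dev/null 2>&1; then
    major=$(systemd_major "$(systemctl --version)")
    if [ -n "$major" ] && has_resolved_fix "$major"; then
        echo "systemd $major: v248 or newer"
    else
        echo "systemd ${major:-unknown}: older than v248"
    fi
fi
```

Distributions sometimes backport fixes, so a version below 248 does not by itself prove the bug is present.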
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Adam Romanek |
