3520 - Troubleshooting conflicting DNS resolutions in NIOS

Scenario

External users are having problems resolving hosts in the devops.training.infoblox.com domain, but internal users using your Grid members ibns1 (10.100.0.105) and ibns2 (10.200.0.105) for resolution are working as expected. Please troubleshoot and locate the root cause(s) for this issue.

Estimated Completion Time

  • 30 to 40 minutes

Credentials

Description: Grid Manager UI

Username: admin

Password: infoblox

URL or IP: https://10.100.0.100/

Requirements

  • Administrative access to the Grid

Course References

  • 3011: DNS Troubleshooting Methodology

Lab Initiation

Access jump-desktop

Once the lab is deployed, you can access the virtual machines required to complete this lab activity. To begin, click the jump-desktop tile and log in to the Linux UI:

Username: training

Password: infoblox

Initiate lab

To initiate the lab, double-click the Launch Lab icon on the Desktop.

Choose the lab number from the list and click OK.

After clicking OK, you will see a pop-up message with a brief description of the lab task. If the description looks correct, click Yes to continue lab initiation.

Lab initiation will take a couple of minutes to finish.

Once complete, you will see another pop-up message with the login credentials and the URL for the Grid Manager’s User Interface. Note that the credentials may differ from those from prior labs.

Tasks

Task 1: Comparing internal and external answers

Since the reported symptom is that users are getting different answers when querying internal name servers versus external name servers, you should start by sending the same query to both the internal and external name servers and compare the answers.

You may troubleshoot by querying for these names in the devops.training.infoblox.com domain:

  • console.devops.training.infoblox.com

  • server1.devops.training.infoblox.com

  • server2.devops.training.infoblox.com

  • server3.devops.training.infoblox.com

You should send the same query to internal name servers ibns1 (10.100.0.105) and ibns2 (10.200.0.105), and to an external public resolver such as Google (8.8.8.8) or Quad9 (9.9.9.9).

Task 2: Examining Grid configuration for the domain

Examine the DNS configuration on the Grid to find out how the domain devops.training.infoblox.com is configured. Pay special attention to the name servers or NS records.

Task 3: Locating external authoritative name servers for the domain

Using additional options available in dig, modify your query from the first task and send it to external name servers to find out which name server(s) are authoritative for the name console.devops.training.infoblox.com.

Task 4: Analyzing authoritative name servers

After identifying all of the authoritative name servers, query each of them to isolate the root cause(s).


Solutions

Task 1 Solution: Comparing internal and external answers

For the solution, we will query the name console.devops.training.infoblox.com, and send the same query to 2 different servers:

  • Internal name server ibns1 (10.100.0.105), results listed in Figure 3520-1.

  • External name server Google (8.8.8.8), results listed in Figure 3520-2.

Figure 3520-1: Internal Query Results
CODE
$ dig @10.100.0.105 console.devops.training.infoblox.com.

; <<>> DiG 9.18.24-0ubuntu0.22.04.1-Ubuntu <<>> @10.100.0.105 console.devops.training.infoblox.com.
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 40841
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1220
; COOKIE: 95e37e7ecb26ca640100000066d83f4ff75cbd11c471df80 (good)
;; QUESTION SECTION:
;console.devops.training.infoblox.com. IN A

;; ANSWER SECTION:
console.devops.training.infoblox.com. 3488 IN A	203.0.113.50

;; Query time: 0 msec
;; SERVER: 10.100.0.105#53(10.100.0.105) (UDP)
;; WHEN: Wed Sep 04 11:06:55 UTC 2024
;; MSG SIZE  rcvd: 109
Figure 3520-2: External Query Results
CODE
$ dig @8.8.8.8 console.devops.training.infoblox.com.

; <<>> DiG 9.18.24-0ubuntu0.22.04.1-Ubuntu <<>> @8.8.8.8 console.devops.training.infoblox.com.
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 62667
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 512
; EDE: 22 (No Reachable Authority): (At delegation devops.training.infoblox.com for console.devops.training.infoblox.com/a)
;; QUESTION SECTION:
;console.devops.training.infoblox.com. IN A

;; Query time: 4072 msec
;; SERVER: 8.8.8.8#53(8.8.8.8) (UDP)
;; WHEN: Wed Sep 04 11:07:48 UTC 2024
;; MSG SIZE  rcvd: 156

Detailed Analysis of Figure 3520-1: Internal Query Results

  • Line 7: We see the NOERROR return code, indicating this name resolved successfully.

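  • Line 8: The ra (recursion available) flag indicates that the internal name server (10.100.0.105) performs recursion, and the absence of the aa (authoritative answer) flag tells us that the answer is not authoritative: the server obtained it from another name server. We don’t know which name server that is until we examine the configuration on the Grid.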

Detailed Analysis of Figure 3520-2: External Query Results

  • Line 7: We see the SERVFAIL return code, a generic error message that does not reveal much about the nature of the error.

  • Line 8: The ra flag indicates that the external name server (8.8.8.8) performs recursion. It attempted to obtain the answer from other name servers on our behalf but, as the rest of the results show, this attempt failed and we did not receive an answer.

  • Line 12: This line provides a hint as to what may have caused the error: it may be related to delegation.
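When comparing several responses, it can help to reduce each one to the two header fields discussed above. The following is a minimal sketch, not part of the lab tooling: dig_summary is a hypothetical helper name, and the sample input is the header copied from Figure 3520-2.

```shell
# Print the response code and header flags from dig output read on stdin.
dig_summary() {
    awk '
        /->>HEADER<<-/ {
            # Example: ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 40841"
            for (i = 1; i <= NF; i++)
                if ($i == "status:") { rcode = $(i + 1); sub(/,$/, "", rcode) }
        }
        /^;; flags:/ {
            # Example: ";; flags: qr rd ra; QUERY: 1, ..." keeps "qr rd ra"
            flags = $0
            sub(/^;; flags: */, "", flags)
            sub(/;.*/, "", flags)
        }
        END { print "status=" rcode " flags=" flags }
    '
}

# Header lines copied from Figure 3520-2 (the external query).
printf '%s\n' \
    ';; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 62667' \
    ';; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1' |
    dig_summary
```

With the sample input this prints status=SERVFAIL flags=qr rd ra. Against a live server, the output of a dig command such as the ones in Figures 3520-1 and 3520-2 can be piped into dig_summary in the same way.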

We need to find out how the internal name server (10.100.0.105) and the external name server (8.8.8.8) each got their respective answers. We will do this in the next two tasks.
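The comparison in this task can be repeated for the other names and resolvers listed in Task 1. The sketch below assumes only that dig is installed; compare_resolvers is a hypothetical helper name, and the +time and +tries values are arbitrary choices that keep an unresponsive server from stalling the loop.

```shell
# Query one name against several resolvers and print one line per resolver.
compare_resolvers() {
    name=$1
    shift
    for server in "$@"; do
        # Drop dig's ";;" diagnostic lines and join multiple answers with spaces.
        answer=$(dig "@$server" "$name" A +short +time=2 +tries=1 2>/dev/null |
            grep -v '^;' | paste -sd ' ' -)
        printf '%-14s %s\n' "$server" "${answer:-<no answer>}"
    done
}

# Usage (needs network access from the jump-desktop):
#   compare_resolvers console.devops.training.infoblox.com. \
#       10.100.0.105 10.200.0.105 8.8.8.8 9.9.9.9
```

A resolver that returns SERVFAIL prints nothing with +short, so it shows up as <no answer>; rerun the full query against that resolver to see the status and any EDE hint.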

Task 2 Solution: Examining Grid configuration for the domain

Examine the DNS configuration on the Grid to find out how the domain devops.training.infoblox.com is configured. Pay special attention to the name servers or NS records.

From Figure 3520-1: Internal Query Results, we know the server 10.100.0.105 was able to resolve the name by querying another name server or servers. To find out which name server(s) were queried, we need to examine the configuration on the Grid.

  1. Log in to the GM web interface and navigate to Data Management → DNS → Zones.

  2. Select the zone devops.training.infoblox.com, and click Edit. Notice that the zone type is Forward.

  3. An editor dialog appears. Its title bar confirms that the zone type is Forward Zone.

  4. Click the Name Servers tab. This shows that this zone is being served by both ibns1 (10.100.0.105) and ibns2 (10.200.0.105). However, the answers did not originate from these servers. Since this is a forward zone, it means ibns1 and ibns2 forwarded the query to some other servers.

  5. Click the Forwarders tab. This shows where the queries for the domain devops.training.infoblox.com are being forwarded to. We can see that they are being forwarded to four servers, pollux.techblue.io (184.170.237.25), mimosa.techblue.io (45.120.106.133), castor.techblue.io (185.64.246.250) and kochab.techblue.io (206.198.151.22).

Our conclusion at this point is that the Grid is forwarding the zone devops.training.infoblox.com to these 4 servers:

  1. pollux.techblue.io (184.170.237.25)

  2. mimosa.techblue.io (45.120.106.133)

  3. castor.techblue.io (185.64.246.250)

  4. kochab.techblue.io (206.198.151.22)

Now we know how the internal name servers ibns1 and ibns2 are resolving names in this domain. Next, let’s see how external name servers such as 8.8.8.8 resolve the same name, and how that differs from the internal servers.

An alternative to checking the Grid configuration is to query the internal name servers directly for the NS and glue records of the zone. You may use these commands:
dig @10.100.0.105 devops.training.infoblox.com. NS

dig @10.200.0.105 devops.training.infoblox.com. NS

However, dig alone cannot show all of the configuration details (such as the Use Forwarders Only option). Since we have access to the name servers’ configuration, it is better to go straight to the configuration.

Task 3 Solution: Locating external authoritative name servers for the domain

Using additional options available in dig, modify your query from the first task and send it to external name servers to find out which name server(s) are authoritative for the name console.devops.training.infoblox.com.

The +trace option in dig performs a logical (delegation) trace, showing which name servers are authoritative for each domain as we work our way down to console.devops.training.infoblox.com. Figure 3520-3 shows the tracing results.

Figure 3520-3: External Query Tracing Results
CODE
$ dig @8.8.8.8 console.devops.training.infoblox.com. +trace +nodnssec +nocmd
.			87113	IN	NS	g.root-servers.net.
.			87113	IN	NS	j.root-servers.net.
.			87113	IN	NS	e.root-servers.net.
.			87113	IN	NS	l.root-servers.net.
.			87113	IN	NS	d.root-servers.net.
.			87113	IN	NS	a.root-servers.net.
.			87113	IN	NS	b.root-servers.net.
.			87113	IN	NS	i.root-servers.net.
.			87113	IN	NS	m.root-servers.net.
.			87113	IN	NS	h.root-servers.net.
.			87113	IN	NS	c.root-servers.net.
.			87113	IN	NS	k.root-servers.net.
.			87113	IN	NS	f.root-servers.net.
;; Received 239 bytes from 8.8.8.8#53(8.8.8.8) in 4 ms

com.			172800	IN	NS	a.gtld-servers.net.
com.			172800	IN	NS	b.gtld-servers.net.
com.			172800	IN	NS	c.gtld-servers.net.
com.			172800	IN	NS	d.gtld-servers.net.
com.			172800	IN	NS	e.gtld-servers.net.
com.			172800	IN	NS	f.gtld-servers.net.
com.			172800	IN	NS	g.gtld-servers.net.
com.			172800	IN	NS	h.gtld-servers.net.
com.			172800	IN	NS	i.gtld-servers.net.
com.			172800	IN	NS	j.gtld-servers.net.
com.			172800	IN	NS	k.gtld-servers.net.
com.			172800	IN	NS	l.gtld-servers.net.
com.			172800	IN	NS	m.gtld-servers.net.
;; Received 861 bytes from 198.97.190.53#53(h.root-servers.net) in 48 ms

;; communications error to 2001:503:39c1::30#53: timed out
infoblox.com.		172800	IN	NS	ns5.infoblox.com.
infoblox.com.		172800	IN	NS	ns6.infoblox.com.
infoblox.com.		172800	IN	NS	ns7.infoblox.com.
infoblox.com.		172800	IN	NS	ns1.infoblox.com.
;; Received 291 bytes from 192.33.14.30#53(b.gtld-servers.net) in 36 ms

devops.training.infoblox.com. 3600 IN	NS	fractus.training.infoblox.com.
devops.training.infoblox.com. 3600 IN	NS	debilis.training.infoblox.com.
;; Received 169 bytes from 12.23.72.166#53(ns4.infoblox.com) in 84 ms

;; communications error to 203.0.113.212#53: timed out
;; communications error to 203.0.113.213#53: timed out
;; no servers could be reached

Detailed Analysis of Figure 3520-3: External Query Tracing Results

  • Line 1: +trace allows us to perform a logical trace, listing every name server involved in this name resolution; +nodnssec hides DNSSEC and other cryptographic information that we do not need to see for this scenario; +nocmd skips printing back the original command on screen to save us some text to look at.

  • Lines 2 to 15: This lists all the name servers that are authoritative for the root zone. Root is represented by a single dot (.) at the beginning of the line.

  • Lines 17 to 30: This lists all the name servers that are authoritative for the com zone. We can see the domain name com. at the beginning of the line.

  • Lines 33 to 37: This lists the name servers that are authoritative for the infoblox.com zone. The figure shows four NS records (ns1, ns5, ns6 and ns7); the response on line 41 comes from a fifth server, ns4.infoblox.com.

  • Lines 39 to 40: This lists the two name servers that are authoritative for the devops.training.infoblox.com zone: debilis.training.infoblox.com and fractus.training.infoblox.com. This output does not show their IP addresses, although we can make some educated guesses (see below).

  • Lines 43 to 45: Attempts to resolve the name console.devops.training.infoblox.com by querying the IP addresses 203.0.113.212 and 203.0.113.213 failed.

Based on this, we can conclude that the failure to resolve the name console.devops.training.infoblox.com is related to our repeated failed attempts to reach the IP addresses 203.0.113.212 and 203.0.113.213. But what are these IP addresses? If you remember how NS records and glue records work, you would probably make an educated guess that they are the IP addresses for the NS records debilis.training.infoblox.com and fractus.training.infoblox.com. So let’s test that theory by performing a few different dig queries.
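In a long trace, the addresses that timed out are easy to overlook. As a small sketch (unreachable_servers is a hypothetical helper name), the lines dig prints for timeouts can be filtered out of any saved or piped dig output; the sample input is copied from the end of Figure 3520-3.

```shell
# List the unique server addresses that dig reported as timed out.
# Reads dig output on stdin.
unreachable_servers() {
    sed -n 's/^;; communications error to \(.*\)#[0-9]*: timed out$/\1/p' | sort -u
}

# Lines copied from the end of Figure 3520-3.
printf '%s\n' \
    ';; communications error to 203.0.113.212#53: timed out' \
    ';; communications error to 203.0.113.213#53: timed out' \
    ';; no servers could be reached' |
    unreachable_servers
```

This prints 203.0.113.212 and 203.0.113.213, one per line. Run against the full trace, it would also list the IPv6 address that timed out at the com level, which is unrelated to the problem because the trace continued past it.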

Resolving Glue Records for the domain devops.training.infoblox.com

According to the name servers on the Internet (root, com, and infoblox.com), devops.training.infoblox.com is hosted on 2 servers, debilis.training.infoblox.com and fractus.training.infoblox.com. Below is the output of using dig to find their IP addresses.

Figure 3520-4: Glue Records Results
CODE
training@jump-desktop:~ $ dig @8.8.8.8 debilis.training.infoblox.com. +short
203.0.113.213
training@jump-desktop:~ $ dig @8.8.8.8 fractus.training.infoblox.com. +short
203.0.113.212

Another way to do this is to query one of the parent zone name servers. Recall from Figure 3520-3 (lines 33 to 37) that the parent zone infoblox.com has 5 authoritative name servers. We can query any one of them for the delegation information.

Figure 3520-5: Glue Records From Parent Zone
CODE
$ dig @ns1.infoblox.com. devops.training.infoblox.com. NS +nocomment +nocmd +norecurse
;devops.training.infoblox.com.	IN	NS
devops.training.infoblox.com. 3600 IN	NS	debilis.training.infoblox.com.
devops.training.infoblox.com. 3600 IN	NS	fractus.training.infoblox.com.
;; Query time: 4 msec
;; SERVER: 23.96.113.219#53(ns1.infoblox.com.) (UDP)
;; WHEN: Wed Sep 04 11:45:24 UTC 2024
;; MSG SIZE  rcvd: 129

Both Figures 3520-4 and 3520-5 are doing the same thing, just slightly differently: Figure 3520-4 is asking for the recursive or non-authoritative answers from an open resolver (8.8.8.8), while Figure 3520-5 is asking one of the parent authoritative servers (ns1.infoblox.com) directly with no recursion.

Our conclusion at this point is that the servers on the Internet are delegating the zone devops.training.infoblox.com to these two servers:

  1. debilis.training.infoblox.com (203.0.113.213)

  2. fractus.training.infoblox.com (203.0.113.212)
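The two lookups in Figures 3520-4 and 3520-5 can be combined into one step. This is a sketch under assumptions: delegation_targets is a hypothetical helper name, and the parent server and resolver are passed in by the caller.

```shell
# Print each NS name the parent hands out for a zone, with its address(es).
# parent: a name server of the parent zone; resolver: any recursive resolver.
delegation_targets() {
    zone=$1 parent=$2 resolver=$3
    # +noall +answer +authority prints only the records of those two sections,
    # so the NS records are found whether the parent answers or refers.
    dig "@$parent" "$zone" NS +norecurse +noall +answer +authority |
        awk '$3 == "IN" && $4 == "NS" { print $5 }' |
        while read -r ns; do
            addrs=$(dig "@$resolver" "$ns" A +short </dev/null | paste -sd ' ' -)
            printf '%s %s\n' "$ns" "${addrs:-<unresolved>}"
        done
}

# Usage (needs network access from the jump-desktop):
#   delegation_targets devops.training.infoblox.com. ns1.infoblox.com. 8.8.8.8
```

Each output line pairs a delegated name server with the address the rest of the Internet will try to reach, which is the list we need for the next task.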

Task 4 Solution: Analyzing authoritative name servers

To summarize our findings, we know:

  1. Grid (internal) is forwarding the domain devops.training.infoblox.com to these 4 servers:

    • pollux.techblue.io (184.170.237.25)

    • mimosa.techblue.io (45.120.106.133)

    • castor.techblue.io (185.64.246.250)

    • kochab.techblue.io (206.198.151.22)

  2. The world (external) is delegating the domain devops.training.infoblox.com to these 2 servers:

    • debilis.training.infoblox.com (203.0.113.213)

    • fractus.training.infoblox.com (203.0.113.212)

So let’s query each of these 6 authoritative servers and compare the output. For all of these queries, we are using the +short option to minimize the output, and the +norecurse option to turn off recursion since we are communicating directly with authoritative servers.

Figure 3520-6: Resolution Attempts Against All 6 Servers
CODE
$ dig @184.170.237.25 console.devops.training.infoblox.com. +short +norecurse
203.0.113.50

$ dig @45.120.106.133 console.devops.training.infoblox.com. +short +norecurse
203.0.113.50

$ dig @185.64.246.250 console.devops.training.infoblox.com. +short +norecurse
203.0.113.50

$ dig @206.198.151.22 console.devops.training.infoblox.com. +short +norecurse
203.0.113.50

$ dig @203.0.113.212 console.devops.training.infoblox.com. +short +norecurse
;; communications error to 203.0.113.212#53: timed out
;; communications error to 203.0.113.212#53: timed out
;; communications error to 203.0.113.212#53: timed out

; <<>> DiG 9.18.12-0ubuntu0.22.04.2-Ubuntu <<>> @203.0.113.212 console.devops.training.infoblox.com. +short +norecurse
; (1 server found)
;; global options: +cmd
;; no servers could be reached

$ dig @203.0.113.213 console.devops.training.infoblox.com. +short +norecurse
;; communications error to 203.0.113.213#53: timed out
;; communications error to 203.0.113.213#53: timed out
;; communications error to 203.0.113.213#53: timed out

; <<>> DiG 9.18.12-0ubuntu0.22.04.2-Ubuntu <<>> @203.0.113.213 console.devops.training.infoblox.com. +short +norecurse
; (1 server found)
;; global options: +cmd
;; no servers could be reached

Detailed Analysis of Figure 3520-6: Resolution Attempts Against All 6 Servers

  • Lines 1 to 2: This query shows that pollux.techblue.io (184.170.237.25) can resolve the name console.devops.training.infoblox.com.

  • Lines 4 to 5: This query shows that mimosa.techblue.io (45.120.106.133) can resolve the name console.devops.training.infoblox.com.

  • Lines 7 to 8: This query shows that castor.techblue.io (185.64.246.250) can resolve the name console.devops.training.infoblox.com.

  • Lines 10 to 11: This query shows that kochab.techblue.io (206.198.151.22) can resolve the name console.devops.training.infoblox.com.

  • Lines 13 to 21: This query shows that fractus.training.infoblox.com (203.0.113.212) is unreachable.

  • Lines 23 to 31: This query shows that debilis.training.infoblox.com (203.0.113.213) is unreachable.

Why turn off recursion when querying authoritative servers? First, most authoritative servers do not accept recursive queries; disabling recursion on authoritative servers is a basic DNS best practice. Second, if the target server cannot answer for the name itself, we do not want it to resolve the name by asking other servers. For our purpose here, we would rather see the query fail so that we can isolate the issue.
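The six queries in Figure 3520-6 can also be run as a loop. The sketch below is one possible way to do it: check_servers is a hypothetical helper name, the +time and +tries values are arbitrary, and a non-zero exit status from dig (for example, no reply from the server) is treated as unreachable.

```shell
# Ask every candidate server for the same name, without recursion,
# and label the result.
check_servers() {
    name=$1
    shift
    for server in "$@"; do
        if answer=$(dig "@$server" "$name" A +short +norecurse +time=2 +tries=1 2>/dev/null); then
            printf '%s\t%s\n' "$server" "${answer:-no answer}"
        else
            printf '%s\tunreachable\n' "$server"
        fi
    done
}

# Usage (needs network access from the jump-desktop):
#   check_servers console.devops.training.infoblox.com. \
#       184.170.237.25 45.120.106.133 185.64.246.250 206.198.151.22 \
#       203.0.113.212 203.0.113.213
```

Short timeouts keep the loop quick, at the cost of occasionally marking a slow server as unreachable; rerun a single query with dig's defaults before drawing conclusions about one server.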

Root Cause and Conclusion

  1. The working authoritative servers are pollux.techblue.io (184.170.237.25), mimosa.techblue.io (45.120.106.133), castor.techblue.io (185.64.246.250) and kochab.techblue.io (206.198.151.22).

  2. Internal queries work because members ibns1 (10.100.0.105) and ibns2 (10.200.0.105) are doing conditional forwarding to working servers.

  3. External queries fail because the parent infoblox.com (with 5 name servers) is delegating to the wrong places, debilis (203.0.113.213) and fractus (203.0.113.212).

  4. This is a classic misconfiguration known as lame delegation. The most likely cause is that the child domain devops.training.infoblox.com changed name servers, but did not notify the parent zone to update the corresponding NS and glue records.

  5. Unfortunately, this cannot be easily fixed on your own; you must notify the parent zone (infoblox.com) administrators and coordinate with them to update the NS and glue records.
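Once the parent zone has been updated, the delegation can be rechecked. One common test, sketched here with hypothetical helper names (has_aa_flag, check_delegation), is to ask each delegated server for the zone's SOA record without recursion and look for the aa (authoritative answer) flag. A server that times out or answers without aa is reported as lame.

```shell
# Succeeds if dig output on stdin carries the aa (authoritative answer) flag.
has_aa_flag() {
    grep -Eq '^;; flags:[^;]* aa[ ;]'
}

# Report, for each delegated server, whether it answers authoritatively
# for the zone.
check_delegation() {
    zone=$1
    shift
    for server in "$@"; do
        if dig "@$server" "$zone" SOA +norecurse +time=2 +tries=1 2>/dev/null | has_aa_flag; then
            echo "$server authoritative"
        else
            echo "$server lame"
        fi
    done
}

# Usage (needs network access from the jump-desktop):
#   check_delegation devops.training.infoblox.com. 203.0.113.212 203.0.113.213
```

In the state captured in this lab, both delegated addresses would be expected to show as lame, because neither responds at all.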
