Opened 7 years ago

Closed 7 years ago

#2568 closed defect (wontfix)

bindctl systest is broken

Reported by: jinmei
Owned by:
Priority: medium
Milestone:
Component: tests
Version:
Keywords:
Cc:
CVSS Scoring:
Parent Tickets:
Sensitive: no
Defect Severity: N/A
Sub-Project: DNS
Feature Depending on Ticket:
Estimated Difficulty: 6
Add Hours to Ticket: 0
Total Hours: 0
Internal?: no

Description (last modified by jinmei)

tests/system/bindctl is broken and fails. There seem to be several
issues:

  • there's a timing issue with when b10-auth is ready to answer queries. It's trickier now that the data source configuration happens asynchronously in a separate thread.
  • the same thing sometimes happens with b10-stats, too (but that should be unrelated to the asynchronous zone loading of b10-auth, of course).

These could be worked around with some amount of sleeping (there are
already some sleeps), but that's not really reliable, and sleeping isn't
good practice anyway (it makes the tests take unnecessarily
long).

So I'm going to disable the test for now. To fix this, I'd rather
suggest migrating to a lettuce test. It has a more sophisticated way
to detect the readiness of servers.

Change History (9)

comment:1 follow-up: Changed 7 years ago by naokikambe

As far as I can see from the log below, the second DNS query seemed to fail. Wasn't auth ready to answer? I don't think it was related to stats.

http://git.bind10.isc.org/~tester/builder//BIND10-systest/20121219203710-MacOS/logs/systest.out

I:starting server nsx1
I:Checking b10-auth is disabled by default (0)
I:Starting b10-auth and checking that it works (1)
I:failed

http://git.bind10.isc.org/~tester/builder//BIND10-systest/20121219203710-MacOS/logs/files/tests/system/bindctl/nsx1/bind10.run:

2012-12-19 13:03:23.936 DEBUG [b10-auth.auth/74141] AUTH_PACKET_RECEIVED message received:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 19121
;; flags:; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;ns.example.com. IN A

2012-12-19 13:03:23.936 DEBUG [b10-auth.auth/74141] AUTH_SEND_ERROR_RESPONSE sending an error response (32 bytes):
;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 19121
;; flags: qr; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;ns.example.com. IN A
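
The log above suggests b10-auth was already running and answered REFUSED, rather than not responding at all. A minimal sketch of how a test script could tell these cases apart by pulling the status field out of dig-style output. `rcode_of` is a hypothetical helper, not something in the existing tests.sh:

```shell
# rcode_of FILE
# Print the "status:" value from the first ";; ->>HEADER<<-" line in FILE
# (dig-style output). Prints nothing if no header line is found.
# Hypothetical helper for illustration only.
rcode_of() {
    sed -n 's/^;; ->>HEADER<<- .*status: \([A-Z0-9]*\),.*/\1/p' "$1" | head -n 1
}
```

A test could then treat REFUSED as "not ready yet" and try again instead of failing immediately. Whether REFUSED always means the zone has not been loaded yet is an assumption here.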

Furthermore, in my environment it passed even when I reverted the commit below. I don't know why it passed. Was that because zone loading wasn't blocked?

57b40ee [master] disabled bindctl system test, referring to #2568.

comment:2 in reply to: ↑ 1 ; follow-up: Changed 7 years ago by jinmei

Replying to naokikambe:

As far as I can see from the log below, the second DNS query seemed to fail. Wasn't auth ready to answer? I don't think it was related to stats.

In this particular case, it's not related to stats. Note the "several
issues" in the ticket description; there is another, different issue with
bindctl, which makes cases like this fail:

I:starting server nsx1
I:Checking b10-auth is disabled by default (0)
I:Starting b10-auth and checking that it works (1)
I:Checking BIND 10 statistics after a pause (2)
I:Stopping b10-auth and checking that (3)
I:Restarting b10-auth and checking that (4)
I:Rechecking BIND 10 statistics after a pause (5)
I:Changing the data source from sqlite3 to in-memory (6)
I:Rechecking BIND 10 statistics after changing the datasource (7)
I:Starting more b10-auths and checking that (8)
I:Rechecking BIND 10 statistics consistency after a pause (9)
I:failed 
I:Stopping extra b10-auths and checking that (10)
I:failed
I:exit status: 1
R:FAIL

But I just realized I misunderstood one thing in the original
description: the '<' and '>' were not part of the search keyword, so
it was irrelevant to the test failure. I'll update the ticket
description.

Furthermore, in my environment it passed even when I reverted the commit below. I don't know why it passed. Was that because zone loading wasn't blocked?

Perhaps. In any case I suspect it's a subtle timing issue and we
should migrate to a lettuce test.

comment:3 Changed 7 years ago by jinmei

  • Description modified (diff)

comment:4 in reply to: ↑ 2 ; follow-up: Changed 7 years ago by naokikambe

Replying to jinmei:

Replying to naokikambe:

Furthermore, in my environment it passed even when I reverted the commit below. I don't know why it passed. Was that because zone loading wasn't blocked?

Perhaps. In any case I suspect it's a subtle timing issue and we
should migrate to a lettuce test.

After that, it failed in my environment. But once I inserted a one-second sleep as follows, it never failed.

  • tests/system/bindctl/tests.sh

    diff --git a/tests/system/bindctl/tests.sh b/tests/system/bindctl/tests.sh
    index 75c91de..337cc68 100755
    --- a/tests/system/bindctl/tests.sh
    +++ b/tests/system/bindctl/tests.sh
    @@ -44,6 +44,7 @@ config commit
     quit
     ' | $RUN_BINDCTL \
            --csv-file-dir=$BINDCTL_CSV_DIR 2>&1 > /dev/null || status=1
    +sleep 1
     $DIG +norec @10.53.0.1 -p 53210 ns.example.com. A >dig.out.$n || status=1
     # perform a simple check on the output (digcomp would be too much for this)
     grep 192.0.2.1 dig.out.$n > /dev/null || status=1

I also think there is a timing issue in the systest.

Last edited 7 years ago by naokikambe

comment:5 in reply to: ↑ 4 Changed 7 years ago by jinmei

Replying to naokikambe:

Furthermore, in my environment it passed even when I reverted the commit below. I don't know why it passed. Was that because zone loading wasn't blocked?

Perhaps. In any case I suspect it's a subtle timing issue and we
should migrate to a lettuce test.

After that, it failed in my environment. But once I inserted a one-second sleep as follows, it never failed.

Adding a sleep is a well-known workaround (btw, I needed to insert
multiple sleeps), but it has obvious drawbacks.

comment:6 Changed 7 years ago by shane

Seems like we could add a log entry and check for that to help with the starting problem. Perhaps we can also look for log entries for each zone that is loaded as well.
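
A rough sketch of the log-check idea, assuming the server writes a recognizable line to its log file once it is ready. `wait_for_log` is a hypothetical helper, and any message ID passed to it (for example a new "auth is ready" entry) would have to be added to b10-auth first:

```shell
# wait_for_log FILE PATTERN ATTEMPTS INTERVAL
# Poll FILE until a line matching PATTERN appears, checking at most
# ATTEMPTS times and sleeping INTERVAL seconds between checks.
# Returns 0 once the pattern is seen, 1 if it never shows up.
# Hypothetical helper for illustration only.
wait_for_log() {
    _wfl_file=$1
    _wfl_pattern=$2
    _wfl_left=$3
    _wfl_interval=$4
    while [ "$_wfl_left" -gt 0 ]; do
        # A missing file simply counts as "not there yet".
        if grep -- "$_wfl_pattern" "$_wfl_file" > /dev/null 2>&1; then
            return 0
        fi
        _wfl_left=$((_wfl_left - 1))
        if [ "$_wfl_left" -gt 0 ]; then
            sleep "$_wfl_interval"
        fi
    done
    return 1
}
```

In tests.sh this might be called as `wait_for_log nsx1/bind10.run SOME_READY_MESSAGE 40 0.25 || status=1`, where SOME_READY_MESSAGE is a placeholder. Fractional sleep intervals are not guaranteed by POSIX, so a whole number of seconds is the portable choice.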

The lettuce tests do something like checking every 0.25 or 0.5 seconds, for up to N attempts, to see when something has started. That seems a lot better than arbitrary sleeps, although it's also not perfect.
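
The same "check every so often, up to N attempts" pattern could be sketched in the shell-based systest as well. This is only an illustration: `retry` is a hypothetical helper, not how the lettuce tests are actually implemented:

```shell
# retry ATTEMPTS INTERVAL COMMAND [ARGS...]
# Run COMMAND until it succeeds, at most ATTEMPTS times, sleeping
# INTERVAL seconds between failed attempts.
# Returns 0 on the first success, 1 if every attempt failed.
# Hypothetical helper for illustration only.
retry() {
    _retry_left=$1
    _retry_interval=$2
    shift 2
    while [ "$_retry_left" -gt 0 ]; do
        if "$@"; then
            return 0
        fi
        _retry_left=$((_retry_left - 1))
        # Only pause if another attempt will follow.
        if [ "$_retry_left" -gt 0 ]; then
            sleep "$_retry_interval"
        fi
    done
    return 1
}
```

The dig and grep check from tests.sh could be wrapped in a small function and run as `retry 20 0.5 that_function || status=1` in place of a fixed `sleep 1`. The test then waits only as long as it needs to, but it is still bounded by the attempt count, so it shares the "not perfect" caveat.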

comment:7 Changed 7 years ago by jwright

  • Component changed from Unclassified to tests
  • Milestone changed from New Tasks to Next-Sprint-Proposed

comment:8 Changed 7 years ago by jinmei

This is now moot as #2624 is being resolved. I'm going to close this ticket.

comment:9 Changed 7 years ago by jinmei

  • Resolution set to wontfix
  • Status changed from new to closed