Opened 7 years ago

Closed 7 years ago

#2562 closed defect (fixed)

CC_TIMEOUT with no zonemgr when handling NOTIFY

Reported by: jreed
Owned by: jinmei
Priority: medium
Milestone: Sprint-20130402
Component: b10-auth
Version:
Keywords:
Cc:
CVSS Scoring:
Parent Tickets:
Sensitive: no
Defect Severity: Medium
Sub-Project: DNS
Feature Depending on Ticket:
Estimated Difficulty: 2
Add Hours to Ticket: 0
Total Hours: 5.66
Internal?: no

Description

Also see #1847 for related information.

If zonemgr is not running (on purpose) but a NOTIFY is received:

2012-12-14 11:00:23.697 DEBUG [b10-auth.auth/2429] AUTH_RECEIVED_NOTIFY received incoming NOTIFY for zone name foo., zone class IN
2012-12-14 11:00:23.697 DEBUG [b10-auth.cc/2429] CC_GROUP_SEND sending message '{ "command": [ "notify", { "master": "127.0.0.1", "zone_class": "IN", "zone_name": "foo." } ] }' to group 'Zonemgr'
2012-12-14 11:00:23.698 DEBUG [b10-auth.cc/2429] CC_GROUP_RECEIVE trying to receive a message
2012-12-14 11:00:27.705 ERROR [b10-auth.cc/2429] CC_TIMEOUT timeout reading data from command channel
2012-12-14 11:00:27.705 ERROR [b10-auth.auth/2429] AUTH_ZONEMGR_COMMS error communicating with zone manager: Timeout while reading data from cc session

In November on jabber, jinmei suggested:

(08:16:31 PM) jinmei: again, we could do the same for ddns
(08:16:50 PM) jinmei: simply return NOTIMP if the corresponding component isn't running
(08:17:01 PM) jinmei:
        } else if (opcode == Opcode::UPDATE()) {
            if (impl_->ddns_forwarder_) {
                send_answer = impl_->processUpdate(io_message);
            } else {
                makeErrorMessage(impl_->renderer_, message, buffer,
                                 Rcode::NOTIMP(), tsig_context);
            }

The BIND 9 way is to return a NOTIFY answer and log the event. BIND 9 logs "not authoritative" if it is not authoritative for the zone. In BIND 10, auth cannot tell whether the component is loaded -- unless we use a loadable configuration module to store this information (as is done for tsig).

I think NOTIMP is fine for now.

Change History (17)

comment:1 Changed 7 years ago by jwright

  • Defect Severity changed from N/A to Medium
  • Milestone changed from New Tasks to Next-Sprint-Proposed

comment:2 Changed 7 years ago by jinmei

  • Milestone changed from Previous-Sprint-Proposed to Next-Sprint-Proposed

comment:3 Changed 7 years ago by vorner

  • Milestone changed from Previous-Sprint-Proposed to Next-Sprint-Proposed

Once #2676 is done, this should be resolved by it. We should just confirm and close.

comment:4 Changed 7 years ago by shane

  • Milestone changed from Previous-Sprint-Proposed to Next-Sprint-Proposed

Okay, #2676 is done!

comment:5 Changed 7 years ago by shane

So this ticket is "confirm & close".

comment:6 Changed 7 years ago by shane

  • Milestone changed from Previous-Sprint-Proposed to Next-Sprint-Proposed

comment:7 Changed 7 years ago by jelte

  • Estimated Difficulty changed from 4 to 2
  • Milestone changed from Next-Sprint-Proposed to Sprint-20130402

As discussed in the planning meeting, I am reducing the estimated difficulty to 2.

comment:8 Changed 7 years ago by jinmei

  • Owner set to jinmei
  • Status changed from new to accepted

comment:9 Changed 7 years ago by jinmei

trac2562 is ready for review.

This actually required some more work than "confirm"; I at least
needed to update noisy (and not precisely correct) log messages.
I also updated the corresponding test case slightly.

I also wasted some time by first trying to use the rpcCall
interface, only to find that it currently doesn't work. To keep
others from hitting the same problem, I've added notes to the method's
documentation, even though that's not directly related to the subject
of this branch.

I'd also note that the revised code doesn't return NOTIMP. Since auth
now knows it has the authority for the notified zone, it seems more
appropriate to handle this case as if the zone were not a secondary
zone.
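
For illustration only, the separation described above could be sketched as follows. This is not the code from the trac2562 branch: `ForwardResult`, `LogLevel`, and `logLevelForNotifyForward` are hypothetical names, and the specific level used for the "zonemgr not running" case (info here) is an assumption. The only point taken from this ticket is that a missing zonemgr is handled separately from other communication errors and is no longer logged at the error level.

```cpp
// Hypothetical outcome of forwarding a NOTIFY from auth to zonemgr.
enum class ForwardResult {
    Delivered,    // zonemgr received the notify command
    NoRecipient,  // nothing is listening on the Zonemgr group
    CommsError    // timeout or other command channel failure
};

// Hypothetical log levels, ordered from least to most severe.
enum class LogLevel { Debug, Info, Error };

// Choose a log level so that only genuine communication failures
// are reported as errors.
LogLevel logLevelForNotifyForward(ForwardResult result) {
    switch (result) {
    case ForwardResult::Delivered:
        return LogLevel::Debug;
    case ForwardResult::NoRecipient:
        // Assumption: expected when zonemgr is stopped on purpose,
        // so it should not add noise at the error level.
        return LogLevel::Info;
    case ForwardResult::CommsError:
        return LogLevel::Error;
    }
    return LogLevel::Error;  // not reached; keeps compilers quiet
}
```

With a mapping like this, the CC_TIMEOUT-style failure from the description would still surface as an error, while an intentionally stopped zonemgr would not.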

I guess this probably needs a changelog. Proposal:

594.?	[bug]		jinmei
	b10-auth now handles the case where zonemgr isn't running while
	handling an incoming NOTIFY message from other general errors
	in the communication with zonemgr.  Previously both cases were
	logged at the error level, and resulted in increasing noise when
	zonemgr is intentionally stopped.  Other than the log level
	there is no change in externally visible behavior.
	(Trac #2562, git TBD)

comment:10 Changed 7 years ago by jinmei

  • Owner changed from jinmei to UnAssigned
  • Status changed from accepted to reviewing

comment:11 Changed 7 years ago by pselkirk

  • Owner changed from UnAssigned to pselkirk

comment:12 follow-up: Changed 7 years ago by pselkirk

  • Owner changed from pselkirk to jinmei

Code changes look good, builds and tests fine, but I'm having trouble parsing the first sentence of the changelog.

comment:13 in reply to: ↑ 12 Changed 7 years ago by jinmei

Replying to pselkirk:

> Code changes look good, builds and tests fine, but I'm having trouble parsing the first sentence of the changelog.

Thanks for the review.

Regarding the changelog, I agree it wasn't readable. How about this?

594.?	[bug]		jinmei
	b10-auth now handles the case where zonemgr isn't running while
	handling an incoming NOTIFY message separately from other general
	errors in the communication with zonemgr.  Previously both cases
	were logged at the error level, and resulted in increasing noise
	when zonemgr is intentionally stopped.  Other than the log level
	there is no change in externally visible behavior.
	(Trac #2562, git TBD)

And, I just realized I forgot to do one more thing in the branch:
adding a lettuce (system) test. And then it revealed the branch
was actually buggy. I should have explicitly set "want_answer" to
true. I've made these changes to the branch. Could you review it?
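
The ticket does not show the relevant code, so the following is only a toy model of why such a flag might matter. It assumes that the message bus reports a missing recipient to the sender only when the sender has asked for an answer; `ToyBus`, `SendOutcome`, and `send` are hypothetical names and are not part of the BIND 10 cc API.

```cpp
#include <set>
#include <string>

// Hypothetical result of sending a command to a group.
enum class SendOutcome {
    Queued,               // a subscriber exists; the message is delivered
    NoRecipientReported,  // no subscriber, and the sender is told so
    SilentlyDropped       // no subscriber, and the sender learns nothing
};

// Toy stand-in for the message bus; not the real implementation.
class ToyBus {
public:
    void subscribe(const std::string& group) { groups_.insert(group); }

    // With want_answer == true a missing recipient is reported right
    // away. With want_answer == false the sender gets no feedback and
    // could only notice the problem through a later read timeout.
    SendOutcome send(const std::string& group, bool want_answer) const {
        if (groups_.count(group) != 0) {
            return SendOutcome::Queued;
        }
        return want_answer ? SendOutcome::NoRecipientReported
                           : SendOutcome::SilentlyDropped;
    }

private:
    std::set<std::string> groups_;
};
```

Under that assumption, a sender that leaves the flag unset cannot tell "zonemgr is not running" apart from a slow reply, which is consistent with the bug found by the system test.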

comment:14 Changed 7 years ago by jinmei

  • Owner changed from jinmei to pselkirk

comment:15 follow-up: Changed 7 years ago by pselkirk

  • Owner changed from pselkirk to jinmei

Code changes look okay.

For the changelog, I think this is what you're trying to say:

        Added special handling for the case where b10-auth receives a NOTIFY
        message, but zonemgr isn't running. Previously this was logged as a
        communications problem at the ERROR level, resulting in increasing
        noise when zonemgr is intentionally stopped. Other than the log
        level there is no change in externally visible behavior.

comment:16 in reply to: ↑ 15 Changed 7 years ago by jinmei

Replying to pselkirk:

> Code changes look okay.

Okay, thanks.

> For the changelog, I think this is what you're trying to say:
>
>         Added special handling for the case where b10-auth receives a NOTIFY
>         message, but zonemgr isn't running. Previously this was logged as a
>         communications problem at the ERROR level, resulting in increasing
>         noise when zonemgr is intentionally stopped. Other than the log
>         level there is no change in externally visible behavior.

Yes, and this looks much better. I've merged the branch, and updated
the changelog with your proposed text. Thanks for the suggestion.

Now closing the ticket.

comment:17 Changed 7 years ago by jinmei

  • Resolution set to fixed
  • Status changed from reviewing to closed
  • Total Hours changed from 0 to 5.66