Opened 8 years ago

Closed 8 years ago

#1372 closed task (complete)

IXFR-out protocol handling: AXFR style IXFR

Reported by: jinmei Owned by: jinmei
Priority: medium Milestone: Sprint-20111122
Component: xfrout Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: 4 Add Hours to Ticket: 0
Total Hours: 9.12 Internal?: no

Description

Maybe we can do the entire protocol handling in a single task,
but in an attempt of keeping tasks as small as possible, I propose
dividing it two sub tasks. This is the second one (first one is #1371).

This task would also include the case where the requested version
is the newest.

Subtickets

Attachments (1)

diff (4.3 KB) - added by jinmei 8 years ago.

Download all attachments as: .zip

Change History (17)

comment:1 Changed 8 years ago by jelte

  • Estimated Difficulty changed from 0 to 4

comment:2 Changed 8 years ago by jelte

  • Milestone changed from Next-Sprint-Proposed to Sprint-20111122

comment:3 Changed 8 years ago by jinmei

I've made some small updates for the purpose of this task, and
pushed the branch (trac1372) for reference. If anyone needs to
pick up a new task please feel free to steal it (or start a
completely new branch if you like).

comment:4 Changed 8 years ago by jinmei

  • Owner set to jinmei
  • Status changed from new to accepted

comment:5 Changed 8 years ago by jinmei

trac1372 is ready for review.

To keep the size of branch reasonably small, I left some things
for #1371:

  • pure IXFR case (the original intent of #1371)
  • the case where the requested SOA is the newest one

Also, I found a bug in #1333 and fixed it within this branch.
It's commit d3db538. This fix should be merged anyway (although
it can wait in practice since no one other than IXFR-out code
would use it).

I plan to have a single changelog entry for the set of #1371 and
#1372, so at this moment there's no proposed entry.

comment:6 Changed 8 years ago by jinmei

  • Owner changed from jinmei to UnAssigned
  • Status changed from accepted to reviewing

comment:7 follow-up: Changed 8 years ago by vorner

  • Owner changed from UnAssigned to jinmei

Hello

I have few notes about it:

  • The find_zone in tests, the docstring is wrong. I don't see it returning PARTIALMATCH anywhere.
  • In the find method in tests, what is the different between the first if case and the else case?
  • About the tests generally, I noticed quite some code from them was removed. Is it OK to do? Did the functionality tested disappear or did the tests move somewhere else?
  • Is it right the sign-every-nth message is gone and everything is signed now? I'd guess so by reading the changes, is that true? If so, what was the reason? Is it OK to do in this branch?
  • This thing:
    # Make sure the question is valid.  This should be ensured by
    # the auth server, but since it's far from xfrout itself, we check
    # it by ourselves.  A viloation would be an internal bug, so we
    # raise and stop here rather than returning a FORMERR or SERVFAIL.
    
    Will anything at all be returned to the client, or will we just drop the connection in such case? I believe even an internal bug requires a SERVFAIL, doesn't it?
  • You call the xfrout a „daemon“ in the log message descriptions. I believe this might be misleading a little bit, for bind10 itself can run in foreground and xfrout is only part of the system anyway.
  • Is it OK to convert a general std::exception (like std::bad_alloc) to isc.datasrc.Error? Shouldn't it be a generic Exception, then?
  • Should it really throw in case of no diffs table? Shouldn't it say NotImplemented? or pretend there are no diffs? Who creates the table anyway? If it's xfrin, it might mean that no diffs were stored yet just. What will happen in this case anyway? Will it fallback to AXFR-like?

Thank you

comment:8 in reply to: ↑ 7 ; follow-up: Changed 8 years ago by jinmei

Replying to vorner:

Thanks for the prompt review!

I have few notes about it:

  • The find_zone in tests, the docstring is wrong. I don't see it returning PARTIALMATCH anywhere.

Ah, that's a result of naive copy-paste (and adjust). Fixed.

  • In the find method in tests, what is the different between the first if case and the else case?

Another good catch. It resulted in awkward code as a result of
incremental development. I believe the latest version makes more
sense.

  • About the tests generally, I noticed quite some code from them was removed. Is it OK to do? Did the functionality tested disappear or did the tests move somewhere else?

Hmm, as far as I can see few (or even no) tests were removed in the
*body* of this branch. Maybe you are referring to changes in the
first commit (315f499) that merges #1370? (1370 itself wasn't a
dependency for this task but I merge it to minimize the diff at the
merge time). Since we stopped skipping intermediate TSIGs, some
corresponding tests were removed in #1370. (See also the next
bullet)

  • Is it right the sign-every-nth message is gone and everything is signed now? I'd guess so by reading the changes, is that true? If so, what was the reason? Is it OK to do in this branch?

See #1370 (and changelog in master for this). I should have noted
this in the pre-review comment, but forgot to mention that, which may
have confused you. If so, sorry about that.

In any case, it's a result of merge (which is now in master), and not
directly related to this branch.

  • This thing:
    # Make sure the question is valid.  This should be ensured by
    # the auth server, but since it's far from xfrout itself, we check
    # it by ourselves.  A viloation would be an internal bug, so we
    # raise and stop here rather than returning a FORMERR or SERVFAIL.
    
    Will anything at all be returned to the client, or will we just drop the connection in such case? I believe even an internal bug requires a SERVFAIL, doesn't it?

Actually, I thought in this case there is even *no client* as long as
the auth server does the right thing, so this bug should be a more
serious thing. That's why I thought it would make more sense to do
nothing for this case (and the case of unexpected request type). In
general, I agree an internal error would result in SERVFAIL.

But in this specific case I'm also okay with following the general
practice if you prefer it.

  • You call the xfrout a „daemon“ in the log message descriptions. I believe this might be misleading a little bit, for bind10 itself can run in foreground and xfrout is only part of the system anyway.

Hmm, actually I just followed the seeming convention adopted in this
message file. I'm okay with changing daemon to e.g., server, but if
we do so the change should apply throughout the message file.

  • Is it OK to convert a general std::exception (like std::bad_alloc) to isc.datasrc.Error? Shouldn't it be a generic Exception, then?

Hmm, maybe you're right. I adopted the convention used in
DataSourceClient_getUpdater(), but a generic Exception seems to make
more sense in these cases. I've updated getJournalReader. Maybe
we should do the same for getUpdater(), but for now I've not touched it.

  • Should it really throw in case of no diffs table? Shouldn't it say NotImplemented? or pretend there are no diffs? Who creates the table anyway? If it's xfrin, it might mean that no diffs were stored yet just. What will happen in this case anyway? Will it fallback to AXFR-like?

As we discussed at bind10-dev separately, I personally hesitate to add
more code for compatibility hack (like throwing NotImplemented? due to
schema mismatch) in this early stage. In this particular case (adding
a "diffs" table), my suggestion was to add an update script and have
the user apply it. And, for xfrout, I'd rather handle it like other
data source level failure, i.e., returning SERVFAIL (it's (as a
general case with exception) tested in
test_dns_xfrout_start_datasrc_servfail).

comment:9 Changed 8 years ago by jinmei

  • Owner changed from jinmei to vorner

comment:10 in reply to: ↑ 8 ; follow-up: Changed 8 years ago by vorner

  • Owner changed from vorner to jinmei

Hello

Replying to jinmei:

Replying to vorner:

  • About the tests generally, I noticed quite some code from them was removed. Is it OK to do? Did the functionality tested disappear or did the tests move somewhere else?

Hmm, as far as I can see few (or even no) tests were removed in the
*body* of this branch. Maybe you are referring to changes in the
first commit (315f499) that merges #1370? (1370 itself wasn't a
dependency for this task but I merge it to minimize the diff at the
merge time). Since we stopped skipping intermediate TSIGs, some
corresponding tests were removed in #1370. (See also the next
bullet)

Ah, now I see. I failed to notice there's a merge, I noticed it is on top of other branch only and diffed against that branch. That included the whole merged thing as well.

  • This thing:
    # Make sure the question is valid.  This should be ensured by
    # the auth server, but since it's far from xfrout itself, we check
    # it by ourselves.  A viloation would be an internal bug, so we
    # raise and stop here rather than returning a FORMERR or SERVFAIL.
    
    Will anything at all be returned to the client, or will we just drop the connection in such case? I believe even an internal bug requires a SERVFAIL, doesn't it?

Actually, I thought in this case there is even *no client* as long as
the auth server does the right thing, so this bug should be a more
serious thing. That's why I thought it would make more sense to do
nothing for this case (and the case of unexpected request type). In
general, I agree an internal error would result in SERVFAIL.

But in this specific case I'm also okay with following the general
practice if you prefer it.

Well, what I think is, if such bad request gets to the xfrout, the auth server already didn't do the right thing, it handed the connection. If there's no client, then the answer would end up in a black hole. But if there's and the server forgot to inform it about the bad request (which we can probably expect, as it gave the handed the connection over), the client should be told at last something. In such case, I believe we should give it a SERVFAIL, as that one might mean a server bug as well as anything else and a server bug is exactly what happened.

Or, is there something I don't see in this?

  • You call the xfrout a „daemon“ in the log message descriptions. I believe this might be misleading a little bit, for bind10 itself can run in foreground and xfrout is only part of the system anyway.

Hmm, actually I just followed the seeming convention adopted in this
message file. I'm okay with changing daemon to e.g., server, but if
we do so the change should apply throughout the message file.

OK, if it's there in many instances, I'll bring it to the maillist sometime. No need to hold this branch because of that.

  • Should it really throw in case of no diffs table? Shouldn't it say NotImplemented? or pretend there are no diffs? Who creates the table anyway? If it's xfrin, it might mean that no diffs were stored yet just. What will happen in this case anyway? Will it fallback to AXFR-like?

As we discussed at bind10-dev separately, I personally hesitate to add
more code for compatibility hack (like throwing NotImplemented? due to
schema mismatch) in this early stage. In this particular case (adding
a "diffs" table), my suggestion was to add an update script and have
the user apply it. And, for xfrout, I'd rather handle it like other
data source level failure, i.e., returning SERVFAIL (it's (as a
general case with exception) tested in
test_dns_xfrout_start_datasrc_servfail).

OK, it doesn't matter much now anyway I think. I guess shane and you are the only ones who use the server in some kind of production ;-).

comment:11 in reply to: ↑ 10 ; follow-up: Changed 8 years ago by jinmei

Replying to vorner:

Actually, I thought in this case there is even *no client* as long as
the auth server does the right thing, so this bug should be a more
serious thing. That's why I thought it would make more sense to do
nothing for this case (and the case of unexpected request type). In
general, I agree an internal error would result in SERVFAIL.

But in this specific case I'm also okay with following the general
practice if you prefer it.

Well, what I think is, if such bad request gets to the xfrout, the auth server already didn't do the right thing, it handed the connection. If there's no client, then the answer would end up in a black hole. But if there's and the server forgot to inform it about the bad request (which we can probably expect, as it gave the handed the connection over), the client should be told at last something. In such case, I believe we should give it a SERVFAIL, as that one might mean a server bug as well as anything else and a server bug is exactly what happened.

Or, is there something I don't see in this?

Not necessarily, but I simply couldn't think of a case where these
things happen, except, of course, due to a very stupid bug in
b10-auth, and I thought that we should rather drop any state and
continue (or if it's lightweight even kill it and restart) should this
happen than trying to do anything more especially with someone via
a network.

In general, I personally think we shouldn't try to do too much if the
code identifies our internal bug, while we should return SERVFAIl if
it's a system error such as resource shortage or bad data in database
source.

Maybe we should discuss this as a general policy project-wide.
For this particular case, I'd rather close this ticket faster than
continuing the discussion (after all, this event shouldn't happen and
wouldn't matter much in practice, until/unless we really introduce a
bug in b10-auth). I've created a diff to change the behavior to
returning SERVFAIL (attaching to the ticket). If you still prefer
this behavior and the diff is okay, I'll introduce it and complete the
ticket; otherwise, if you rather defer the discussion in a generic
context, I'll complete this ticket without applying it.

(I believe we don't have any other open issue).

Changed 8 years ago by jinmei

comment:12 Changed 8 years ago by jinmei

  • Owner changed from jinmei to vorner

comment:13 in reply to: ↑ 11 ; follow-up: Changed 8 years ago by vorner

  • Owner changed from vorner to jinmei

Hello

Replying to jinmei:

Not necessarily, but I simply couldn't think of a case where these
things happen, except, of course, due to a very stupid bug in
b10-auth, and I thought that we should rather drop any state and
continue (or if it's lightweight even kill it and restart) should this
happen than trying to do anything more especially with someone via
a network.

Well, being into philosophy, I'd say that all bugs are stupid. If the bug is intelligent, it becomes a feature.

Anyway, you're right about the fact that this probably will never happen, therefore we don't need to spend too much time worrying about it. So leave it as it is.

Maybe we should discuss this as a general policy project-wide.
For this particular case, I'd rather close this ticket faster than
continuing the discussion (after all, this event shouldn't happen and
wouldn't matter much in practice, until/unless we really introduce a
bug in b10-auth). I've created a diff to change the behavior to
returning SERVFAIL (attaching to the ticket). If you still prefer
this behavior and the diff is okay, I'll introduce it and complete the
ticket; otherwise, if you rather defer the discussion in a generic
context, I'll complete this ticket without applying it.

OK, discussion is fine. I'll try to remember for the next non-planning call.

This ticket looks OK to merge.

comment:14 Changed 8 years ago by vorner

  • Total Hours changed from 0 to 1.75

comment:15 in reply to: ↑ 13 Changed 8 years ago by jinmei

Replying to vorner:

This ticket looks OK to merge.

Thanks, merged with #1371, closing ticket.

comment:16 Changed 8 years ago by jinmei

  • Resolution set to complete
  • Status changed from reviewing to closed
  • Total Hours changed from 1.75 to 9.12
Note: See TracTickets for help on using tickets.