Opened 8 years ago

Closed 8 years ago

#1212 closed task (complete)

Implement IXFR-in

Reported by: stephen Owned by: jinmei
Priority: very high Milestone: Sprint-20111011
Component: xfrin Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: 12 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

Implement the protocol handling of IXFR-in. In particular:

  • Generation of an IXFR query packet
  • Handling of UDP response indicating no update available
  • Handling of UDP response containing update
  • Handling of UDP response indicating update available but requiring fallback to TCP to get data.
  • Validation of incoming data integrity.
  • Update of zone with incoming IXFR data

Note: after a more detailed analysis of the amount of work for each component, it may be a necessity to break this ticket down into smaller tickets.

Subtickets

Change History (20)

comment:1 Changed 8 years ago by jelte

  • Estimated Difficulty changed from 0 to 12

comment:2 Changed 8 years ago by jelte

  • Milestone changed from Sprint-20110927 to Sprint-20111011

comment:3 follow-up: Changed 8 years ago by jelte

Whoever takes up this ticket should start out with reviewing the current code and RFC1995 and make a quick (and we mean quick) breakdown of the tasks involved, as discussed on the call today.

comment:4 in reply to: ↑ 3 ; follow-up: Changed 8 years ago by jinmei

Replying to jelte:

Whoever takes up this ticket should start out with reviewing the current code and RFC1995 and make a quick (and we mean quick) breakdown of the tasks involved, as discussed on the call today.

I've read RFC1995 and BIND 9's xfrin implementation.

First off, I propose a couple of simplifications toward the next
release:

  • Don't support UDP. In fact, if I understand it correctly even BIND 9 doesn't support IXFR-in over UDP.
  • Don't support IXFR to AXFR fallback for now. To help environment where IXFR doesn't work, introduce a (possibly short term) workaround of an option to disable IXFR (see below)

Here's my proposed task breakdown:

  1. Add support for the "PRESERVE_ORDER" option in libdns++ (and pydnspp) Message::fromWire().
  2. A basic framework to handle RR diffs (add/delete). This will eventually be used for dynamic update support.
  3. Main code logic for IXFR protocol: send IXFR query, receive response, and handle them (it would be nicer if we could break down this further, but I cannot think of any reasonable subtasking for this part)
  4. A small configuration extension: "disable-ixfr" option (preferably per zone basis)

3 depends on 1, 2, and 4, but it should be possible to start work on 3
without waiting for the others to be fully completed. There should be
no other dependencies.

Some details:

As for 1, see the DNS_MESSAGEPARSE_PRESERVEORDER option of BIND 9's
dns_message_parse(). To quote the important part: "If
DNS_MESSAGEPARSE_PRESERVEORDER is set, a separate dns_name_t object
will be created for each RR in the message.". This feature was
necessary for AXFR, too, but we've worked around it with a heuristics
hack so far. It seems to be the time to address it.

As for 2, I propose porting BIND 9's ixfr_putdata(), ixfr_apply(), and
ixfr_commit() (implemented in lib/dns/xfrin.c), and
dns_difftuple_create() and dns_diff_apply() (lib/dns/diff.c). We
introduce a notion of "RR diff", which (conceptually) consists of an
operation mode (add or delete) and an RR (which would be an RRset
object containing only one RDATA in our API). dns_diff_apply()
accepts a list of RR diffs, combining them into RRset diffs (for
efficiency reasons), then make the corresponding changes on the zone
DB. ixfr_putdata() takes an operation (add or delete) and RR from the
main IXFR logic, creates an RR diff object, and maintains it in a
list. For every 100 diffs, it calls ixfr_apply(). ixfr_apply()
starts a new transaction (if it has already started transaction,
continue it) and makes changes for the RR diff list accumulated so far
under the transaction. ixfr_commit() calls ixfr_apply() to apply any
remaining diff and commit the transaction.

As for 3, we'll update the do_xfrin() function of the current
implementation:

            if ret == XFRIN_OK:
                logger.info(XFRIN_AXFR_TRANSFER_STARTED, self._zone_name)
                self._send_query(RRType.AXFR())
                isc.datasrc.sqlite3_ds.load(self._db_file, self._zone_name,
                                            self._handle_xfrin_response)

Unless IXFR is disabled, send IXFR query instead of AXFR, and instead
of doing the sqlite3 specific stuff, implement and perform a
completely separate code logic, where:

  • receive IXFR messages (until the xfrin process completes), parse it with PRESERVE_ORDER mode
  • handle each RR in the responses according to RFC1995. call ixfr_putdata() for each RR, and ixfr_commit() for each "difference" (a single set of deletes and adds for one SOA change). I also suggest checking BIND 9's lib/dns/xfrin.c:xfr_rr() to know any corner cases and hints for tests.

Some additional notes:

  • In our current API add/delete RRsets doesn't check data integrity. For example, "delete" doesn't check if the specified RRset exists. It shouldn't cause a problem as long as this ixfr context is the only writer to the zone and the remote primary server implementation behaves correctly, which wouldn't be unreasonable assumption in practice (in fact, even BIND 9 seems to rely on the accurate behavior of the primary server to some extent). So at least at the moment I propose ignoring the error cases where given "delete" data doesn't exist, etc.
  • The above implementation logic uses a separate transaction for a single IXFR "diff". So if the IXFR message consists of multiple diffs and there's another writer to the DB, the integrity may be broken. We may want to discuss this point separately, but for now I propose implementing it this way. Like the previous point, it will normally not be a problem in practice. Also, BIND 9 seems to use the same logic here (using separate "transactions" for different diffs of a single IXFR message).

comment:5 in reply to: ↑ 4 Changed 8 years ago by jinmei

Replying to jinmei:

  1. Main code logic for IXFR protocol: send IXFR query, receive response, and handle them (it would be nicer if we could break down this further, but I cannot think of any reasonable subtasking for this part)

On further thought we may be able to further divide this into two subparts:
(really) inremental transfer and non incremental (AXFR-compatible) transfer.
The latter part can be forwarded to the existing AXFR code with some minor
tweak in the existing code. These two subtasks are somehow inter-related
and need cooperation, but could be developed in parallel.

comment:6 Changed 8 years ago by jelte

  • Priority changed from major to blocker

comment:7 follow-up: Changed 8 years ago by jelte

In general this looks like a reasonable breakdown.

I do wonder about the commit-per-100-'diffs' though, but maybe i misunderstand that specific part of your description. One per soa->soa I can understand (as this reflects one complete change), but doing intermediate commits may leave the zone in an inconsistent state even without a second writer (if there is a failure after it). OTOH if we don't have onle one big transaction it may protect us from those, but one might run into global database transaction timeouts...

comment:8 in reply to: ↑ 7 Changed 8 years ago by jinmei

Replying to jelte:

In general this looks like a reasonable breakdown.

I do wonder about the commit-per-100-'diffs' though, but maybe i misunderstand that specific part of your description.

It's not *commit*-per-100, but *apply*-per-100. What *apply* does is
to start a new transaction (if not yet started) and apply the diffs
under the transaction (in the database case by issuing SQL statements,
for example), but it doesn't perform "commit" yet.

So, *apply*-per-100 means we delay starting a transaction until we
have 100 diffs (or have full diffs for a single SOA-SOA difference if
they are smaller than 100). This also means if the entire diff
contain multiple RRs of the same RRset we may be able to benefit from
combining them before actually applying the diff to the underlying
data source (this part wouldn't be a big deal for DB-based data
source, though, because we apply the diff per RR anyway, and so the
total number of necessary SQL statements won't be different).

This behavior is simply derived from the BIND 9 implementation (and I
don't know the rationale of the magic number of 100). If we don't
like it or we think it's a premature optimization, we could skip that
in our initial implementation.

One per soa->soa I can understand (as this reflects one complete change), but doing intermediate commits may leave the zone in an inconsistent state even without a second writer (if there is a failure after it). OTOH if we don't have onle one big transaction it may protect us from those, but one might run into global database transaction timeouts...

Exactly, this tradeoff is what I was not sure either. For our initial
experimental implementation (which would be more for a proof of
concept than for production use), either approach should be okay.

comment:9 Changed 8 years ago by jinmei

Created subtasks: #1258, #1259, #1260, #1261, #1262.

This ticket now becomes a meta ticket. When all of the subtasks are
completed this one can also be closed.

comment:10 follow-up: Changed 8 years ago by jelte

There will be one task left for this ticket before we close it: changelog entry/entries, as pointed out by Stephen in #1260

comment:11 in reply to: ↑ 10 Changed 8 years ago by jinmei

Replying to jelte:

There will be one task left for this ticket before we close it: changelog entry/entries, as pointed out by Stephen in #1260

Here is the proposed ChangeLog?.

294.?	[func]		jelte, jinmei, vorner
	b10-xfrin now supports incoming IXFR.  See BIND 10 Guide for
	how to configure it and operational notes.
	(Trac #1212, multiple git merges)

As for the responsible authors, I chose the task takers of the related
subtasks. It heavily relies on data source refactoring tasks, so I
also check past sprint tickets and actually found the same set of
developers worked on those tasks, so I didn't anything special for that
(but if I miss something let me know).

I also updated BIND 10 Guide for IXFR on branch trac1212 and pushed it.
Please review it.

Once #1262 is merged and the changes to the guide is reviewed, I'll merge
the guide and close this ticket.

comment:12 Changed 8 years ago by jinmei

  • Owner set to UnAssigned
  • Status changed from new to reviewing

comment:13 Changed 8 years ago by vorner

  • Owner changed from UnAssigned to vorner

comment:14 follow-up: Changed 8 years ago by vorner

  • Owner changed from vorner to jinmei

Hello

Both the changelog and the documentation update look reasonable. But when at it, would you also like to update the b10-xfrin man page? Do we have documentation at any other place as well?

comment:15 in reply to: ↑ 14 Changed 8 years ago by jinmei

Replying to vorner:

Hello

Both the changelog and the documentation update look reasonable. But when at it, would you also like to update the b10-xfrin man page? Do we have documentation at any other place as well?

Ah, good catch. I completely forgot b10-xfrin man.

Hmm, there seems to be some redundancy - we should clarify the
relationship between man pages and the guide, but that's a general
issue, not specific to xfrin. For now, I update the man page with
minimal adjustments (also referring to the guide for some details).

And one more thing: while working on #1209 I noticed I forgot to
complete the detailed log description for XFRIN_GOT_INCREMENTAL_RESP.
So I made that change in this branch, too.

Could you also check that?

comment:16 Changed 8 years ago by jinmei

  • Owner changed from jinmei to vorner

comment:17 Changed 8 years ago by jinmei

I've pushed the "okayed" part (i.e., changelog and guide) to master,
so that the release will include at least these. I'll keep this
ticket (and branch) open for the rest of the changes.

comment:18 follow-up: Changed 8 years ago by vorner

  • Owner changed from vorner to jinmei

The rest looks OK too, please merge it as well.

Thanks

comment:19 in reply to: ↑ 18 Changed 8 years ago by jinmei

Replying to vorner:

The rest looks OK too, please merge it as well.

Thanks, merged, and closing.

comment:20 Changed 8 years ago by jinmei

  • Resolution set to complete
  • Status changed from reviewing to closed
Note: See TracTickets for help on using tickets.