Opened 8 years ago

Closed 8 years ago

Last modified 8 years ago

#1305 closed task (complete)

auth NSEC support: some more updates to data source

Reported by: jinmei Owned by: jinmei
Priority: medium Milestone: Sprint-20111025
Component: data source Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: 5 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description (last modified by jinmei)

This is subtask 1 of #1244. Many others depend on this, so this one
must be completed quickly.

Subtickets

Change History (14)

comment:1 Changed 8 years ago by jinmei

  • Description modified (diff)

comment:2 Changed 8 years ago by jinmei

  • Owner set to jinmei
  • Status changed from new to accepted

comment:3 Changed 8 years ago by jinmei

The trac1305 branch is ready for review.

This branch basically does what was described in the parent ticket
(#1244) in a pretty straightforward way, but I guess I need to make
notes about a few things.

  • This branch does not introduce the new "WILDCARD_EMPTY" code. On a closer look, it seemed we actually don't need it - in the end we should be able to handle both "EMPTY" and "NXRRSET" cases in the same way (although why they can be same is not that straightforward as it might look). I'll update the corresponding followup tickets about this.
  • The WILDCARD_CNAME code was added, but right now b10-auth does not handle it (even for the non DNSSEC case). It doesn't do real harm in practice (yet), however, because the only available datasrc for b10-auth right now (i.e. in-memory) doesn't (yet) return this new code anyway.
  • On a related note to the previous bullet, I guess we should probably separate the notion of wildcard related result and other result codes, rather than integrating everything in the same enum space. But that would be a separate discussion.
  • I made straightforward updates to the python wrapper, but the test is not so detailed as the C++ tests. This is actually not specific to this ticket - before we start this task the python tests are much more limited than C++ versions. I believe we should eventually port the C++ versions of tests (maybe also porting the mock database accessor) for the python wrappers, but I'd suggest doing it in a separate ticket.

comment:4 Changed 8 years ago by jinmei

  • Owner changed from jinmei to UnAssigned
  • Status changed from accepted to reviewing

comment:5 Changed 8 years ago by stephen

  • Owner changed from UnAssigned to stephen

comment:6 follow-up: Changed 8 years ago by stephen

  • Owner changed from stephen to jinmei

src/lib/datasrc/database.cc
Not part of the changes here, but can't the "accessor" argument of DatabaseClient be passed by reference and so save the need to copy the shared_ptr when the constructor is called?

In the following code:

} else if ((options & NO_WILDCARD) != 0) {
    // If wildcard check is disabled, terminate the search with
    // NXDOMAIN.
    if (dnssec_data && !records_found) {
        get_cover = true;
    }

...there is no need to check that records_found is false here, as this block of code is inside an

} else if (!records_found) {

... clause and nothing appears to modify records_found since that check.

Also, although the comment doesn't quite reflect the purpose of the following code. Something like:

// If wildcard check is disabled, the search will ultimately terminate
// with NXDOMAIN. If DNSSEC is enabled, flag that we need to get the
// NSEC records to prove this.

... may be better.

src/lib/datasrc/tests/database_unittest.cc
Comment refers to WILDCARD_EMPTY; the actual status is WILDCARD_NXRRSET.

comment:7 in reply to: ↑ 6 Changed 8 years ago by jinmei

Replying to stephen:

Thanks for the review.

src/lib/datasrc/database.cc
Not part of the changes here, but can't the "accessor" argument of DatabaseClient be passed by reference and so save the need to copy the shared_ptr when the constructor is called?

Do you mean passing as DatabaseAccessor?& or as boost::shared_ptr<DatabaseAccessor?>&?
We couldn't use the former in this case because the caller is a
datasource-specific factory method and will immediately release the
ownership of the accessor object. In the latter case we'll have to
dereference it to acquire the ownership anyway for the same reason,
so I suspect the actual benefit of saving the copy is marginal (in
this specific usage, in particular, I guess the compiler will combine
the multiple copy constructions involved in the function call and the
member variable initialization, and the actual benefit would actually
be zero).

In general, I'd be careful about passing a shared pointer in the form
of reference because it could often lead to break the "shared" nature
of the object. If the called function handles the passed reference
carelessly (e.g., keep storing it somewhere in the form of reference)
it can easily lead to dangling ownership. But such a discussion would
be far beyond the scope of this ticket.

In the following code:

} else if ((options & NO_WILDCARD) != 0) {
    // If wildcard check is disabled, terminate the search with
    // NXDOMAIN.
    if (dnssec_data && !records_found) {
        get_cover = true;
    }

...there is no need to check that records_found is false here, as this block of code is inside an

} else if (!records_found) {

... clause and nothing appears to modify records_found since that check.

Ah, good catch, fixed. I forgot to leave a note to reviewer about
this part: there's some code duplication between this case and the end
of the final 'else' block. But I didn't try to unify them at this
point - I thought it made more sense to clean these up in #1198.

Also, although the comment doesn't quite reflect the purpose of the following code. Something like:

// If wildcard check is disabled, the search will ultimately terminate
// with NXDOMAIN. If DNSSEC is enabled, flag that we need to get the
// NSEC records to prove this.

... may be better.

Yeah, as I fixed the previous one I saw the need for revising the
comment, too, so I did it, but I like your version better. Updated.

src/lib/datasrc/tests/database_unittest.cc
Comment refers to WILDCARD_EMPTY; the actual status is WILDCARD_NXRRSET.

Good catch, fixed.

comment:8 Changed 8 years ago by jinmei

  • Owner changed from jinmei to stephen

comment:9 follow-up: Changed 8 years ago by stephen

  • Owner changed from stephen to jinmei

Do you mean passing as DatabaseAccessor?& or as boost::shared_ptr<DatabaseAccessor?>&?

The latter.

In the latter case we'll have to dereference it to acquire the ownership anyway for the same reason, so I suspect the actual benefit of saving the copy is marginal

Well, there is work to do in incrementing and decrementing a reference count. If passed by value, a copy is made within the function (causing an increment of the reference count) then that copy is passed as argument to the accessor_ copy constructor. Finally, the copy is discarded (a decrement of the reference count). You are right in the the cost saving is marginal, but we use shared pointers extensively in BIND 10 and all the costs mount up.

(in this specific usage, in particular, I guess the compiler will combine the multiple copy constructions involved in the function call and the member variable initialization, and the actual benefit would actually be zero).

Bear in mind that a copy constructor could contain code unrelated to the copy (which may be a bad idea but is not prohibited by the C++ standard), so if the compiler were to do that it could be break required behaviour. However, since the boost code is mainly headers, it could be that the compiler is clever enough to see that there are no external calls and to optimise the copy away. But we don't know that.

In general, I'd be careful about passing a shared pointer in the form of reference because it could often lead to break the "shared" nature of the object. If the called function handles the passed reference carelessly (e.g., keep storing it somewhere in the form of reference) it can easily lead to dangling ownership. But such a discussion would be far beyond the scope of this ticket.

I'm not sure I follow your line of reasoning here. A function keeping a pointer (or reference) to a passed-in object if there is no guarantee that the object will remain in existence when the pointer/reference is next used is a problem regardless of the object being passed in. In this case, there is no guarantee about the lifetime of the shared_ptr object (other than it will exist for the duration of the call to this method), so the function shouldn't store a pointer or reference to it.

I'll leave it up to you whether you change the method signature. The code is OK to merge.

comment:10 in reply to: ↑ 9 Changed 8 years ago by jinmei

Replying to stephen:

Do you mean passing as DatabaseAccessor?& or as boost::shared_ptr<DatabaseAccessor?>&?

The latter.

In the latter case we'll have to dereference it to acquire the ownership anyway for the same reason, so I suspect the actual benefit of saving the copy is marginal

Well, there is work to do in incrementing and decrementing a reference count. If passed by value, a copy is made within the function (causing an increment of the reference count) then that copy is passed as argument to the accessor_ copy constructor. Finally, the copy is discarded (a decrement of the reference count). You are right in the the cost saving is marginal, but we use shared pointers extensively in BIND 10 and all the costs mount up.

(in this specific usage, in particular, I guess the compiler will combine the multiple copy constructions involved in the function call and the member variable initialization, and the actual benefit would actually be zero).

Bear in mind that a copy constructor could contain code unrelated to the copy (which may be a bad idea but is not prohibited by the C++ standard), so if the compiler were to do that it could be break required behaviour. However, since the boost code is mainly headers, it could be that the compiler is clever enough to see that there are no external calls and to optimise the copy away. But we don't know that.

I don't deny your points, but in terms of performance considerations
I'd say

  • even dereferencing of a reference of a shared pointer is costly if we care about the marginal overhead of copying a shared pointer object because it involves a function call. So, if the overhead of method calls on a shared pointer object really matters (there may be some such cases like in the middle packet processing), I'd rather pass a reference to the pointed object at the higher level of code (in this example, it would be DatabaseAccessor&).
  • I'd generally rather avoid premature optimizations. I don't object to making this level of optimization when we are at the stage of performance tuning and if we know the accumulated overhead around shared pointers is a real bottleneck via benchmarking. But it's not now.
  • In any case, this specific context isn't performance sensitive at all - the accessor is expected to be created during initialization of the server and will be used for quite a long period. In some cases we'll also use ephemeral accessors, but in that case the performance wouldn't matter much anyway.

But at the same time, I agree we tend to use shared pointers even when
they don't have to be so. I don't think it good, not necessarily for
performance, but rather due to other disadvantages of shared pointers
such as making it unclear who is actually responsible for the
ownership of the object. So, as a matter of cleanup I think it a good
idea to revisit our use of shared pointers. But that will be a rather
larger task outside the scope of this ticket (and, as you noticed,
this interface is not actually part of this branch).

(In this particular, IMO the best would be to have the factory
maintain the accessor in the form of an auto_ptr and pass
it as in that form or bare pointer, and let DatabaseClient? maintain it
in the form of scoped pointer. This way we handle the things safely
with minimally powerful tool. But auto_ptrs are often error prone
also, so it may not be a good practice in the end either...)

In general, I'd be careful about passing a shared pointer in the form of reference because it could often lead to break the "shared" nature of the object. If the called function handles the passed reference carelessly (e.g., keep storing it somewhere in the form of reference) it can easily lead to dangling ownership. But such a discussion would be far beyond the scope of this ticket.

I'm not sure I follow your line of reasoning here. A function keeping a pointer (or reference) to a passed-in object if there is no guarantee that the object will remain in existence when the pointer/reference is next used is a problem regardless of the object being passed in. In this case, there is no guarantee about the lifetime of the shared_ptr object (other than it will exist for the duration of the call to this method), so the function shouldn't store a pointer or reference to it.

My point is that passing a reference is more error prone, and the
mixture of (effectively) passing a shared pointer (which implicitly
indicates the object is assumed to be shared) and actually passing it
in the form of reference will increase the risk of errors. In
principle, you're right that the fact that something is passed by
reference indicates the caller controls the ownership, but I'm afraid
the fact that the referenced object is a shared pointer can give the
callee a false sense of safety. This is an exaggerated bad example:

class Foo {
public:
    Foo(shared_ptr<Bar>& barptr) : barptr_(barptr) {
      // and barptr_ will be used throughout the lifetime of Foo.
    }
private:
    shared_ptr<Bar>& barptr_;
};

but I can imagine a more subtle, but essentially the same, bugs like
this. As I said above, if we are sure that the caller should have the
ownership of the original object, I'd rather pass a reference (or,
suboptimally a pointer if it can be NULL) of the object that the
shared pointer points to, not a reference of the shared pointer; if
the ownership is expected to be shared, I'd rather pass it as a real
shared pointer to clarify the intent, rather than hoping that the
callee understands the intent and behaves accordingly. And,
personally, I don't like to sacrifice the clarity of the intent for
possible performance reasons unless we know the performance bottleneck
is real and substantial in our main usage.

I'll leave it up to you whether you change the method signature. The code is OK to merge.

So...for various reasons I won't touch this part and merge the rest of
the code for now. Again, I don't mind revisiting how to use shared
pointer throughout of our code. If you want please create a separate
ticket or trigger a discussion at the dev list.

comment:11 Changed 8 years ago by jinmei

Branch merged, closing ticket.

comment:12 Changed 8 years ago by jinmei

  • Resolution set to complete
  • Status changed from reviewing to closed

comment:13 Changed 8 years ago by jinmei

  • Estimated Difficulty changed from 0 to 5

comment:14 Changed 8 years ago by jinmei

added the time estimate retroactively.

Note: See TracTickets for help on using tickets.