Opened 8 years ago

Closed 8 years ago

Last modified 8 years ago

#1605 closed task (fixed)

introduce special RRset for in memory data source

Reported by: jinmei Owned by: stephen
Priority: medium Milestone: Sprint-20120306
Component: data source Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket: auth performance
Estimated Difficulty: 5 Add Hours to Ticket: 0
Total Hours: 2.5 Internal?: no

Description

This is a subtask for "Pre-establish shortcuts for additional section processing" described in
https://lists.isc.org/pipermail/bind10-dev/2012-January/002985.html
and a followup task of #1604.

We define a dedicated derived class of AbstractRRset used within the
in memory data source implementation (tentatively named RBNodeRRset
in my experimental version, but can be different). It internally
holds the basic RRset (in the form of ConstRRsetPtr) and forward
all public methods to the composed RRset.

On loading the zone, the in memory data source implementation
constructs an RBNodeRRset for each given ConstRRsetPtr and
maintains it as the data source data. It returns the special RRset as
the return value of the find() method.

It will eventually contain the trick of enabling additional section
shortcut, but that will go to a separate task.

Subtickets

Attachments (1)

memory_test.diff (873 bytes) - added by jinmei 8 years ago.

Download all attachments as: .zip

Change History (23)

comment:1 Changed 8 years ago by jelte

  • Milestone changed from Year 3 Task Backlog to Next-Sprint-Proposed

comment:2 Changed 8 years ago by jelte

  • Estimated Difficulty changed from 0 to 5

comment:3 Changed 8 years ago by jinmei

  • Milestone changed from Year 3 Task Backlog to Next-Sprint-Proposed

comment:4 Changed 8 years ago by jelte

  • Milestone changed from Next-Sprint-Proposed to Sprint-20120221
  • Priority changed from major to minor

comment:5 Changed 8 years ago by stephen

  • Owner set to stephen
  • Status changed from new to assigned

comment:6 Changed 8 years ago by stephen

  • Owner changed from stephen to UnAssigned
  • Status changed from assigned to reviewing

I've used the name RBNodeRRset for the new objects. The in-memory data source has been altered to use these objects.

comment:7 Changed 8 years ago by jinmei

  • Priority changed from minor to major

comment:8 Changed 8 years ago by jinmei

  • Owner changed from UnAssigned to jinmei

comment:9 Changed 8 years ago by jinmei

First off, I made a few trivial changes directly to the branch
(0169ac7). I believe they are okay.

About general design

  • I'd like to hide the definition (or ideally even the existence) of RBNodeRRset more reliably. Conceptually, this class should be a hidden client of the in-memory data source implementation and should never be used by anyone else directly (are we on the same page about this?). I see it'd still be good if we test the class directly, and for that purpose we need to make it visible at least to the test, so, as a compromise for that specific purpose (and basically only for that purpose) I'm okay with defining it in a separate header file. But I'd like to make it very clear that no other applications than tests should use it directly via a comment, and probably by introducing a separate (sub)namespace such as "internal" or "detail".

On a side note, assuming it's conceptually hidden and only visible
to others as a const abstract object as the result of find() (and
its variants) method, not supporting methods that modify the object
should be justifiable. Maybe we want to note that fact.

  • Also considering that RBNodeRRset is conceptually hidden, breaking const_cast in addRRsig/removeRRsig would also be justifiable. But to prevent accidental misuse by a broken application more reliably, I'd rather keep them "not supported", and instead introduce a separate (conceptually private) non virtual methods, say, addRRsigInternal and removeRRsigInternal. The in-memory data source implementation and tests will use the latter.
  • Whether it's virtual or not, we could avoid using const_cast in add/removeRRsig if we copy the given RRset on construction. That's also safer in that we can prevent a surprising possible bug where a passed RRset is reused for some other purpose with a (valid) assumption that it's not magically modified. In practice, however, such a bug is probably unlikely to happen because the RRset would come from the masterLoad() function, which immediately gives up the ownership of the created RRset. Also, not so far in future we'll need to stop retaining the full copy of original RRset anyway (firstly for memory consumption reasons, and possibly also for further response optimization), so this safe measure is probably a short term thing. In the end, I'd leave the decision on this point to you.

rbnode_rrset.h

  • This comment is a bit ambiguous to me:
    /// - Calls that modify the associated RRSIGs of the RRset are allowed (even
    ///   though the pointer is to a "const" object).
    
    That could also be interpreted as modifying the content of the associated RRSIGs. To be more accurate, I'd say e.g. "Calls that add or remove the associated RRSIGs...". (but see also the general design discussion about breaking constness).
  • I'd make the constructor 'explicit'.
  • getUnderlyingRRset() can be non virtual, and I think it's better.

rbnode_rrset_unittest

  • Regarding type parameterisation: we can still do it partially by extracting common patterns with template specialization or just by defining separate tests for separate behaviors as we do for NSEC/NSEC3. But the RBNodeRRset class will evolve more substantially in near future as we add more optimization logic there, and it may or may not make the test sharing more difficult, so until it's clear it's probably okay to keep the tests separate. (This is just a comment).
  • toWireRenderer: I'd suggest getting data (and length) directly from the renderer:
        EXPECT_PRED_FORMAT4(UnitTestUtil::matchWireData, renderer.getData(),
                            renderer.getLength(), &wiredata[0], wiredata.size());
    
    It will be more robust after #1697 (which is ongoing). And, whatever happens in #1697, using renderer should always be correct.
  • With the suggested change above, toWireRenderer and toWireBuffer could be unified into a single template function (if you want).
  • About the added test wire data: if you share test data, I'd rather add a data path and really share the data file than copying it. You should be able to add a path here:
        isc::UnitTestUtil::addDataPath(TEST_DATA_DIR);
    
    (I noticed you modified rrset_toWire2. I was not sure if it was due to an inevitable difference between the two rrset classes or just for convenience. If it's the latter, I'd consider using the same data in the same test pattern)
  • checkSignature: I'd consider using rrsetCheck (defined in lib/testutils/dnsmessage_test.h) or Rdata::compare() to compare two RRs Comparing toText() result may not be very reliable.
  • addRRsigConstRdataPointer: what's the purpose of intermediate 'data' variable? It can be ConstRdataPtr? from the beginning:
        ConstRdataPtr data = createRdata(rrset_siga->getType(),
                                         rrset_siga->getClass(),
                                         RRSIG_TXT);
        rrset_a.addRRsig(data);
        checkSignature(rrset_a);
    

memory_datasrc_unittest

  • Like checkSignature above, I'd suggest considering rrsetsCheck() to compare two sets of RRsets. You can build two vector<RRset>s and pass them to rrsetsCheck().

comment:10 Changed 8 years ago by jinmei

  • Owner changed from jinmei to stephen

comment:11 Changed 8 years ago by vorner

  • Owner changed from stephen to vorner

comment:12 Changed 8 years ago by vorner

  • Owner changed from vorner to stephen

Ups, sorry, I clicked on a wrong ticket. Giving it back O:-).

comment:13 follow-ups: Changed 8 years ago by stephen

  • Owner changed from stephen to jinmei

I'd like to hide the definition (or ideally even the existence) of RBNodeRRset more reliably

I've moved it to an "internal" namespace, although I think the chance of this class being used out of context is negligible.

On a side note, assuming it's conceptually hidden and only visible to others as a const abstract object as the result of find() (and its variants) method, not supporting methods that modify the object should be justifiable. Maybe we want to note that fact.

The modification methods have to be there.

I might be misunderstanding the code, but I thought that when loading a zone, the RRsets and the associated RRSIGs are loaded separately. As an RRset keeps a pointer to the associated RRSIG, if the RRset is loaded first, the object representing it has to be modified when the associated RRSIG is loaded in order to store the pointer to the RRSIG.

The specification for this change called for the underlying RRSet to be stored and made accessible through a ConstRRsetPtr; therefore the "const" nature of the RRset has to be violated to store the RRSIG.

introduce a separate (conceptually private) non virtual methods, say, addRRsigInternal and removeRRsigInternal. The in-memory data source implementation and tests will use the latter.

I suggest leaving that until the code is modified to use the short-cut additional section processing. At the moment, RBNodeRRset is a drop-in replacement for RRset, which makes testing at this intermediate stage easier.

Whether it's virtual or not, we could avoid using const_cast in add/removeRRsig if we copy the given RRset on construction.

True (although in that case we should change the specification to state that the underlying RRset is pointed to by an RRsetPtr object). However, as you note, the RRset is relinquished by the loader as soon as it is passed into the in-memory data source. Given this, and that the code will soon be modified to stop retaining the original RRset, I suggest that it's not worth the effort of changing it.

This comment is a bit ambiguous to me

On re-reading it, it is ambiguous to me as well. Altered.

I'd make the constructor 'explicit'.

Done. As an aside, I'm sure that many classes could benefit from an explicit constructor - should we start a discussion thread for this?

getUnderlyingRRset() can be non virtual.

Done.

toWireRenderer: I'd suggest getting data (and length) directly from the renderer

OK, but in that case we may want to open a ticket to change src/lib/dns/tests/rrset_unittest.cc, as that's where these tests were taken from.

With the suggested change above, toWireRenderer and toWireBuffer could be unified into a single template function (if you want)

Good point. Done.

About the added test wire data: if you share test data, I'd rather add a data path and really share the data file than copying it.

rrset_toWire2 has been modified because the original version included data from both A and NS records. (The original test checks that rendering an A RRset followed by an NS RRset gives wire data containing both.) This test is really only to check that RBNodeRRset::toWire() forwards the call to RRset::toWire() object, so I removed the unnecessary part of the original code.

As to the duplication of data, again I did think about that. But because only one is file duplicated - and because sharing the data means that the tests are dependent on the test data in a different module - I did not do so. I think the introduction of a cross-module dependency of test data would cause more problems than it would solve.

(If multiple files are duplicated, that is another matter. But in that case I would suggest moving the duplicated test data into a common third directory.)

checkSignature: I'd consider using rrsetCheck

checkSignatures replaced by a call to rrsetCheck.

addRRsigConstRdataPointer: what's the purpose of intermediate 'data' variable?

Removed.

Like checkSignature above, I'd suggest considering rrsetsCheck() to compare two sets of RRsets.

Done.

Suggest the following for the ChangeLog entry:

xxx.	[func]        stephen
	Add special RRset class to in-memory data source in preparation for
        changes concerning performance enhancements.
	(Trac #1695, git xxx)

comment:14 in reply to: ↑ 13 Changed 8 years ago by jinmei

Replying to stephen:

On a side note, assuming it's conceptually hidden and only visible to others as a const abstract object as the result of find() (and its variants) method, not supporting methods that modify the object should be justifiable. Maybe we want to note that fact.

The modification methods have to be there.

I might be misunderstanding the code, [...]

What I meant is that we could extend this note:

/// - Calls to methods that change attributes of the underlying RRset (such as
///   TTL or Name) cause an exception to be thrown.  The in-memory data source
///   does not allow modification of these attributes.

like this:

///   does not allow modification of these attributes.  In theory, it is a bad
///   practice in that it doesn't preserve the assumed behavior of the base
///   class.  In practice, however, it should be acceptable because this
///   class is effectively hidden from applications and will only be given
///   to them as a const pointer to the base class via find() varints.
///   So the application cannot call non const methods anway unless it
///   intentionally breaks the constness.

As previously mentioned, I'd also add an explicit note in comments
that this class is basically hidden:

/// does not have to be altered if encapsulation, rather than inheritcance, is
/// used.
///
/// \note This class is exposed in this separate header file so that
/// test code can refer to its definition, and only for that purpose.
/// Otherwise this is essentially a private class of the in-memory
/// data source implementation, and an application shouldn't directly
/// refer to this class.

I'd make the constructor 'explicit'.

Done. As an aside, I'm sure that many classes could benefit from an explicit constructor - should we start a discussion thread for this?

I thought we basically agreed about this practice, and do it in many
cases already (I know there are exceptions, but my understanding is
it's not intentional, just overlooked). I even thought it's in the
coding guideline, but I just checked and noticed it's not. So, yes,
it may be good to have a discussion (or just "confirmation" if my
understanding is correct) and reflect the result in the guideline.

Suggest the following for the ChangeLog entry:

xxx.	[func]        stephen
	Add special RRset class to in-memory data source in preparation for
        changes concerning performance enhancements.
	(Trac #1695, git xxx)

I don't necessarily see the need for a changelog entry for it because
the change is inherently invisible to users. But I don't oppose to
adding it, and if adding it the suggested text is okay for me (except
that the ticket number should be #1605, not 1695:-).

About the revised code:

  • for the iterator test I'd suggest the attached patch. Hardcoding '3' is fragile in case we eventually add more records in the test. Also rrsetsCheck() checks if the number of RRsets is the same internally, so you don't have to do it by hand.
  • in addRRSIG tests of rbnode_rrset_unittest, why do you use getUnderlyingRRset()?
        rrsetCheck(rrset_siga, rrset_a.getUnderlyingRRset()->getRRsig());
    
    Is there a reason it cannot be rrset_a.getRRsig()? (At least the latter seems to be a more straightforward conversion from the original code).

Changed 8 years ago by jinmei

comment:15 Changed 8 years ago by jinmei

  • Owner changed from jinmei to stephen

comment:16 in reply to: ↑ 13 Changed 8 years ago by jinmei

Replying to stephen:

Forgot to comment on one more point:

toWireRenderer: I'd suggest getting data (and length) directly from the renderer

OK, but in that case we may want to open a ticket to change src/lib/dns/tests/rrset_unittest.cc, as that's where these tests were taken from.

This was already done in #1697 (actually, not only for the rrset test,
but throughout the libdns++ tests; otherwise the tests didn't even
compile with #1697).

comment:17 Changed 8 years ago by stephen

  • Owner changed from stephen to jinmei

What I meant is that we could extend this note:

Done.

for the iterator test I'd suggest the attached patch.

Done - it is more elegant.

Is there a reason it cannot be rrset_a.getRRsig()

No - replaced. Don't know what I was thinking of when I wrote it :-(

comment:18 Changed 8 years ago by jinmei

It now looks okay. Please feel free to merge.

comment:19 Changed 8 years ago by jinmei

  • Owner changed from jinmei to stephen

comment:20 Changed 8 years ago by jinmei

  • Total Hours changed from 0 to 2.5

comment:21 Changed 8 years ago by stephen

  • Resolution set to fixed
  • Status changed from reviewing to closed

On reflection, you're right about the change log entry, so I have not added one.

Merged in commit 13fe9c1bdcc2e0a1c8947a328429e6f301c6fb3e

comment:22 Changed 8 years ago by jinmei

(I've noticed trac1605 wasn't removed from the shared repo, so I just did it)

Note: See TracTickets for help on using tickets.