Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#449 closed task (complete)

Implementation of the resolver cache

Reported by: zhanglikun Owned by: jelte
Priority: medium Milestone: R-Team-Sprint-20110222
Component: resolver Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity:
Sub-Project: Feature Depending on Ticket:
Estimated Difficulty: 8.0 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description (last modified by jreed)

Create this ticket for resolver cache implementation.

Subtickets

Change History (14)

comment:1 Changed 9 years ago by jreed

  • Component changed from recurser to resolver
  • Description modified (diff)
  • Summary changed from Implementation of the recursor cache to Implementation of the resolver cache

comment:2 Changed 9 years ago by stephen

  • Milestone set to R-Team-Sprint-20110125

comment:3 Changed 9 years ago by stephen

  • Estimated Difficulty changed from 0.0 to 8

comment:4 Changed 9 years ago by zhanglikun

  • Owner changed from zhanglikun to UnAssigned
  • Status changed from new to reviewing

The code is ready for review now.

comment:5 Changed 9 years ago by stephen

  • Owner changed from UnAssigned to stephen

comment:6 Changed 9 years ago by stephen

  • Owner changed from stephen to zhanglikun

Reviewed commit a8746fb17e10db78330cf69b2850515020f5cd56

General
The following apply to all the files examined:

  • Return values should be enclosed in brackets according to the style guide.
  • A number of methods don't have the doxygen tags to describe all the parameters. These should be added.
  • Comments appear to be restricted to 50 characters wide in a number of files. Suggest that 80 characters would be more usual. (This also appears to be the cause of some method declarations being split across multiple lines.)

Comment: there is an inconsistency in how classes make themselves non-copyable. Some inherit from boost::noncopyable, but others take the approach of making the copy constructor and assignment operator private. Both work, but a consistent approach would be good.

cache_entry_key.cc
The function definitions (not including the return type, which according to the style guide should be on a line by itself) could each fit onto a single 80-character line - there does not appear to be the need to split them across multiple lines.

The code is perhaps simple enough to inline in the header file. It could be even simpler using boost::lexical_cast, i.e.:

const std::string
getCachedEntryName(const std::string& namestr, const uint16_t type) {
    return (namestr + boost::lexical_cast<std::string>(type));
}

This would eliminate the ".cc" file as well as remove the overhead of a call/return.

rrset_copy.h
rrset_copy.cc
Both OK.

Comment: strictly speaking, this class does not copy a RRset, it copies the Rdata (and RRSIGs) pointed to by that RRset. However, used in conjunction with the initialization of "rrset_" in the RRsetEntry constructor, the effect is a copy. As you note, this operation logically belongs in the isc::dns::RRset class. Since this is a definite need to copy an RRset, I suggest that you raise this in bind10-dev, and get a consensus to add a copy constructor and assignment operator to that class. (This does not need to be done as part of this task - if consensus is reached, a task for it can be added to the task backlog.)

message_entry.h
Comment: the fact that the RRsetRef object is used here makes me wonder whether this should be part of the basic RRset class hierarchy. The comments in src/lib/dns/rrset.h suggests that the design is not final and should be revisited. Thoughts?

genMessage(): the parameter description in the header suggests that "response" is the first argument and time_now the second.

Presumably the reason that a number of methods are "protected" instead of "private" is to allow for testing? If so, a one-line comment in the header file explaining that would be helpful.

message_entry.cc
getRRsetTrustLevel(): should the second argument (rrset) be passed by reference instead of by value to avoid a copy operation?

getRRsetEntries(): perhaps mark a possible optimisation for the future: as we know that we will put at most entry_count entries in the vector, vector.reserve(entry_count) could be called to allocate enough space prior to entering the loop.

addRRset(): should the rrset_entry_vec and section arguments be passed by reference instead of by value?

addRRset(): perhaps add an "assert" to catch/log a possible programming error if SECTION_QUESTION is passed by mistake?

genMessage(): the check for expiration is with "t > expire_time"; suggest that the ">" should be ">=". (getRRsetEntries() checks that something has not expired with "t < expire_time", which leaves open the question of what happens if "t == expire_time"?)

genMessage(): to check if any of the RRsets in the message have expired, getRRsetEntries() is called. Although only the return status is used, a vector of RRsetEntryPtrs is created in the process which is discarded without being used. The construction/destruction of the vector is additional overhead; perhaps a method should be added purely to check if a message has expired?

getTrustLevel(): in the processing of Message::SECTION_ANSWER, the code assumes that the first RRset in the section is the one that corresponds to the QNAME. Some servers can return CNAME/DNAME chains, but is there anywhere in the protocol that stipulates the ordering of RRsets in the section? It would be safer to iterate through the answer section until the RRset that matches the QNAME is found.

rrset_entry.cc
updateTTL(): coding standards require braces even for single line blocks.

rrset_cache.cc
update(): Need to add a "TODO" that if the RRset being updated is an NS RRset, the NSAS should be updated as well.

message_cache.h
#include <map> is not needed, as the class does not reference a std::map object. (Was this left over from an earlier iteration of the code?)

message_cache.cc
Comment: regarding the suggestion in update() for a better way to replace the entry. In the short term (and for the hash table only) call add() and supply a third argument of "true". A longer-term solution has been proposed in ticket #535.

resolver_cache.h
In the header comments there is a spurious bracket after the name of the exception being thrown.

resolver_cache.cc
Suggest the three-argument lookup() method be split into "lookup()" which checks the class and calls the second part in another method (e.g. "lookupInternal()") to do the actual lookup. This would allow "lookupClosest()" to bypass the "classIsSupported()" call in "lookup()" (which does a binary search on a vector) by calling "lookupInternal()".

Every access to one of the sub-caches (RRset, Message and LocalZone) requires a lookup in a std::map, something that may be done multiple times in a method. I would suggest that support for classes be done at a higher level - call this class something like ResolverClassCache and call the containing class ResolverCache. When a method is called on ResolverCache, it chooses a ResolverClassCache based on the qclass then calls a corresponding method on that. It simplifies the current class by getting rid of the indexing on the sub-caches and reduces map lookups to a single lookup per method call.

local_zone_data.cc
update(): the call to map::insert() followed by a call to map::erase() and map::insert() again can be replaced by the simpler:

rrsets_map_[key] = rrset_ptr;

(This replaces the value if "key" exists, and adds a new element if it doesn't.)

cache_test_util.h
Should really have header comments for these methods.

local_zone_data_unittest.cc
An assertion that the TTL is non-zero (i.e. TTL/2 != TTL) should be added. This is an added check that the data in message_fromWire3 is such that the test truly checks that the RRset has been updated.

message_cache_unittest.cc
Constructor: A qclass of 1 is IN, but RRClass::IN().getCode(), although more verbose, avoids making any assumptions about class values,

testLookup: A qtype of 1 is "A", but more descriptive is using the static method in RRType class (i.e. RRType::A()).

testUpdate: a qtype of 6 is "SOA", more descriptive would be to use RRType::SOA().

resolver_cache_unittest.cc
See comments in message_cache_unittest.cc for specifying qclass and qtype variables.

rrset_entry_unittest.cc
Suggest that a test be added to ensure that when the TTL drops to zero, it stays there and does not wrap round to a high value. (It shouldn't with the current code in rrset_entry.cc, but this is an error that could be made if that code were modified.)

comment:7 Changed 9 years ago by jelte

  • Owner changed from zhanglikun to jelte

comment:8 Changed 9 years ago by jelte

(Note to likun: I'm stealing this ticket, it's starting to block things)

comment:9 follow-up: Changed 9 years ago by jelte

  • Owner changed from jelte to stephen

Replying to stephen:

Reviewed commit a8746fb17e10db78330cf69b2850515020f5cd56

General
The following apply to all the files examined:

  • Return values should be enclosed in brackets according to the style guide.
  • A number of methods don't have the doxygen tags to describe all the parameters. These should be added.
  • Comments appear to be restricted to 50 characters wide in a number of files. Suggest that 80 characters would be more usual. (This also appears to be the cause of some method declarations being split across multiple lines.)

I think i got most of this.

Comment: there is an inconsistency in how classes make themselves non-copyable. Some inherit from boost::noncopyable, but others take the approach of making the copy constructor and assignment operator private. Both work, but a consistent approach would be good.

removed the boost::noncopyable ones (when there's two reasonably simple ways to do it i prefer the one that has no dependencies)

cache_entry_key.cc
The function definitions (not including the return type, which according to the style guide should be on a line by itself) could each fit onto a single 80-character line - there does not appear to be the need to split them across multiple lines.

done

The code is perhaps simple enough to inline in the header file. It could be even simpler using boost::lexical_cast, i.e.:

const std::string
getCachedEntryName(const std::string& namestr, const uint16_t type) {
    return (namestr + boost::lexical_cast<std::string>(type));
}

This would eliminate the ".cc" file as well as remove the overhead of a call/return.

true, but I'd like use to try and reduce the usage of boost in our headers, so for now i'm leaving this as is

rrset_copy.h
rrset_copy.cc
Both OK.

Comment: strictly speaking, this class does not copy a RRset, it copies the Rdata (and RRSIGs) pointed to by that RRset. However, used in conjunction with the initialization of "rrset_" in the RRsetEntry constructor, the effect is a copy. As you note, this operation logically belongs in the isc::dns::RRset class. Since this is a definite need to copy an RRset, I suggest that you raise this in bind10-dev, and get a consensus to add a copy constructor and assignment operator to that class. (This does not need to be done as part of this task - if consensus is reached, a task for it can be added to the task backlog.)

+1

message_entry.h
Comment: the fact that the RRsetRef object is used here makes me wonder whether this should be part of the basic RRset class hierarchy. The comments in src/lib/dns/rrset.h suggests that the design is not final and should be revisited. Thoughts?

add to discussion on copy?

genMessage(): the parameter description in the header suggests that "response" is the first argument and time_now the second.

Presumably the reason that a number of methods are "protected" instead of "private" is to allow for testing? If so, a one-line comment in the header file explaining that would be helpful.

we'll have to ask likun

message_entry.cc
getRRsetTrustLevel(): should the second argument (rrset) be passed by reference instead of by value to avoid a copy operation?

ack

getRRsetEntries(): perhaps mark a possible optimisation for the future: as we know that we will put at most entry_count entries in the vector, vector.reserve(entry_count) could be called to allocate enough space prior to entering the loop.

done

addRRset(): should the rrset_entry_vec and section arguments be passed by reference instead of by value?

ack

addRRset(): perhaps add an "assert" to catch/log a possible programming error if SECTION_QUESTION is passed by mistake?

why not, added.

genMessage(): the check for expiration is with "t > expire_time"; suggest that the ">" should be ">=". (getRRsetEntries() checks that something has not expired with "t < expire_time", which leaves open the question of what happens if "t == expire_time"?)

changed

genMessage(): to check if any of the RRsets in the message have expired, getRRsetEntries() is called. Although only the return status is used, a vector of RRsetEntryPtrs is created in the process which is discarded without being used. The construction/destruction of the vector is additional overhead; perhaps a method should be added purely to check if a message has expired?

hmm, not sure if that would cause extra overhead, since that vector is used if nothing has expired. (i don't know which would be the most common path)

getTrustLevel(): in the processing of Message::SECTION_ANSWER, the code assumes that the first RRset in the section is the one that corresponds to the QNAME. Some servers can return CNAME/DNAME chains, but is there anywhere in the protocol that stipulates the ordering of RRsets in the section? It would be safer to iterate through the answer section until the RRset that matches the QNAME is found.

looping over until both qname and type match

found what i think is two errors that were tested wrong (in the case where a CNAME is returned *with* the target added (which is not a cname, the test extpected that target to be RRSET_TRUST_ANSWER_NONAA where i think it should have been RRSET_TRUST_ANSWER_AA)

I don't know the specific way this code is called yet, but i suspect that doing the comparison by pointer is not the right thing here and we should have an equals() in RRset (which we don't), or we should explicelty and loudly put in the api docs that you are only to use this code with the same pointers

rrset_entry.cc
updateTTL(): coding standards require braces even for single line blocks.

done

rrset_cache.cc
update(): Need to add a "TODO" that if the RRset being updated is an NS RRset, the NSAS should be updated as well.

er, but not from here i hope

message_cache.h
#include <map> is not needed, as the class does not reference a std::map object. (Was this left over from an earlier iteration of the code?)

removed

message_cache.cc
Comment: regarding the suggestion in update() for a better way to replace the entry. In the short term (and for the hash table only) call add() and supply a third argument of "true". A longer-term solution has been proposed in ticket #535.

it already called add() with true, i've removed the remove()

resolver_cache.h
In the header comments there is a spurious bracket after the name of the exception being thrown.

removed

resolver_cache.cc
Suggest the three-argument lookup() method be split into "lookup()" which checks the class and calls the second part in another method (e.g. "lookupInternal()") to do the actual lookup. This would allow "lookupClosest()" to bypass the "classIsSupported()" call in "lookup()" (which does a binary search on a vector) by calling "lookupInternal()".

Every access to one of the sub-caches (RRset, Message and LocalZone) requires a lookup in a std::map, something that may be done multiple times in a method. I would suggest that support for classes be done at a higher level - call this class something like ResolverClassCache and call the containing class ResolverCache. When a method is called on ResolverCache, it chooses a ResolverClassCache based on the qclass then calls a corresponding method on that. It simplifies the current class by getting rid of the indexing on the sub-caches and reduces map lookups to a single lookup per method call.

Right, I've renamed ResolverCache? to ResolverClassCache?, and made it specific for one class specified at creation time. Then I created a new ResolverCache? that does its magic (now with a basic vector, some hints as to how to optimise better, but for now good enough i hope).

Some code got moved, some got removed. I am undecided as of yet whether the ResolverClassCache? should go into its own header or whether we should hide it completely, and only provide the 'high-level' header for ResolverCache?. (so for now it's in the same header file, resolver_cache.h)

local_zone_data.cc
update(): the call to map::insert() followed by a call to map::erase() and map::insert() again can be replaced by the simpler:

rrsets_map_[key] = rrset_ptr;

(This replaces the value if "key" exists, and adds a new element if it doesn't.)

done

cache_test_util.h
Should really have header comments for these methods.

Done. Also renamed section_rrset_count to sectionRRsetCount. Is that function worthy to add to libdns++ as well?

local_zone_data_unittest.cc
An assertion that the TTL is non-zero (i.e. TTL/2 != TTL) should be added. This is an added check that the data in message_fromWire3 is such that the test truly checks that the RRset has been updated.

done

message_cache_unittest.cc
Constructor: A qclass of 1 is IN, but RRClass::IN().getCode(), although more verbose, avoids making any assumptions about class values,

done

testLookup: A qtype of 1 is "A", but more descriptive is using the static method in RRType class (i.e. RRType::A()).

done

testUpdate: a qtype of 6 is "SOA", more descriptive would be to use RRType::SOA().

done

resolver_cache_unittest.cc
See comments in message_cache_unittest.cc for specifying qclass and qtype variables.

done

rrset_entry_unittest.cc
Suggest that a test be added to ensure that when the TTL drops to zero, it stays there and does not wrap round to a high value. (It shouldn't with the current code in rrset_entry.cc, but this is an error that could be made if that code were modified.)

done

comment:10 follow-up: Changed 9 years ago by stephen

  • Owner changed from stephen to jelte

Reviewed commit a02e4f3441285f10958b72e2337aa1b6e290fb56

removed the boost::noncopyable ones (when there's two reasonably simple ways to do it
i prefer the one that has no dependencies)

Agreed. The boost::noncopyable appears to be a good idea but as it does involve bringing in a bit of boost into the header (with all that entails), defining the copy constructor and assignment operator as private seems to be the best option. Should we update the coding guidelines with this?

src/lib/cache/cache_entry_key.cc

The code is perhaps simple enough to inline in the header file. It could be even
simpler using boost::lexical_cast, i.e.:
:

true, but I'd like use to try and reduce the usage of boost in our headers, so for
now i'm leaving this as is.

Fair enough about the headers. However even in the .cc file I think the simplification would make the code clearer. But since it works as is there's no urgent need to change it.

src/lib/cache/message_entry.cc
getRRsetTrustLevel(): the addition of the code to locate the correct RRset is a good idea. Perhaps add an assertion after the loop that the RRset is found (i.e. rrset_iter != message.endSection(section))?

genMessage(): to check if any of the RRsets in the message have expired, getRRsetEntries() is
called. Although only the return status is used, a vector of RRsetEntryPtrs is created in the
process which is discarded without being used. The construction/destruction of the vector is
additional overhead; perhaps a method should be added purely to check if a message has expired?

hmm, not sure if that would cause extra overhead, since that vector is used if nothing has
expired. (i don't know which would be the most common path)

You're right, ignore my comment.

I don't know the specific way this code is called yet, but i suspect that doing the comparison
by pointer is not the right thing here and we should have an equals() in RRset (which we
don't), or we should explicelty and loudly put in the api docs that you are only to use this
code with the same pointers

This is true. I think that this is loudly arguing for modification of RRset. The earlier comments noted that there was a need for a copy constructor and assignment operator, and this is suggesting that we also need a comparison operator. As these extend and not alter RRset, I have just gone ahead and added task #568 for it.

src/lib/cache/resolver_cache.cc
ResolverClassCache: debug messages still present in some of the methods.

ResolverClassCache::updateRRsetCache(): the RR type is converted to a string and then checked against the strings "A" or "AAAA". I suggest avoiding the string conversion by comparing the RRType() returned by getType() directly against RRType::A() and RRType::AAAA()? (RRType contains operator==().)

Comment: ResolverCache::lookupClosestRRset() - by changing the "if (!cc)" check to "if (cc)", and enclosing the bulk of the code in the "if" braces, the first "return (RRsetPtr())" can be removed - the case where "cc" is NULL will be handled by the final return statement. (Also - although the optimiser and processor branch prediction logic will probably handle this - the common case where the "cc" is non-null avoids taking one of the branches.) A similar comment applies to the update() methods.

src/lib/cache/rrset_cache.cc

update(): Need to add a "TODO" that if the RRset being updated is an NS RRset,
the NSAS should be updated as well.

er, but not from here i hope

Comment: Where should it be done? The idea here was based on the idea that the resolver code should only go to the NSAS to get the address of a nameserver for a zone, it should not need to worry about maintaining the data in it. So when the resolver updates the cache, it is the cache that updates the NSAS. But I'm open to suggestions.

src/lib/cache/tests/cache_test_util.h

Also renamed section_rrset_count to sectionRRsetCount. Is that function worthy to add to libdns++ as well?

+1

comment:11 in reply to: ↑ 10 Changed 9 years ago by jelte

  • Owner changed from jelte to stephen

Replying to stephen:

Reviewed commit a02e4f3441285f10958b72e2337aa1b6e290fb56

removed the boost::noncopyable ones (when there's two reasonably simple ways to do it
i prefer the one that has no dependencies)

Agreed. The boost::noncopyable appears to be a good idea but as it does involve bringing in a bit of boost into the header (with all that entails), defining the copy constructor and assignment operator as private seems to be the best option. Should we update the coding guidelines with this?

sent a mail to -dev to ask if anyone has any opinion on this

src/lib/cache/message_entry.cc
getRRsetTrustLevel(): the addition of the code to locate the correct RRset is a good idea. Perhaps add an assertion after the loop that the RRset is found (i.e. rrset_iter != message.endSection(section))?

ack, done.


I don't know the specific way this code is called yet, but i suspect that doing the comparison
by pointer is not the right thing here and we should have an equals() in RRset (which we
don't), or we should explicelty and loudly put in the api docs that you are only to use this
code with the same pointers

This is true. I think that this is loudly arguing for modification of RRset. The earlier comments noted that there was a need for a copy constructor and assignment operator, and this is suggesting that we also need a comparison operator. As these extend and not alter RRset, I have just gone ahead and added task #568 for it.

ok cool. Ignoring for now :)

src/lib/cache/resolver_cache.cc
ResolverClassCache: debug messages still present in some of the methods.

removed

ResolverClassCache::updateRRsetCache(): the RR type is converted to a string and then checked against the strings "A" or "AAAA". I suggest avoiding the string conversion by comparing the RRType() returned by getType() directly against RRType::A() and RRType::AAAA()? (RRType contains operator==().)

oh i completely read over that... fixed

Comment: ResolverCache::lookupClosestRRset() - by changing the "if (!cc)" check to "if (cc)", and enclosing the bulk of the code in the "if" braces, the first "return (RRsetPtr())" can be removed - the case where "cc" is NULL will be handled by the final return statement. (Also - although the optimiser and processor branch prediction logic will probably handle this - the common case where the "cc" is non-null avoids taking one of the branches.) A similar comment applies to the update() methods.

ack, changed

src/lib/cache/rrset_cache.cc

update(): Need to add a "TODO" that if the RRset being updated is an NS RRset,
the NSAS should be updated as well.

er, but not from here i hope

Comment: Where should it be done? The idea here was based on the idea that the resolver code should only go to the NSAS to get the address of a nameserver for a zone, it should not need to worry about maintaining the data in it. So when the resolver updates the cache, it is the cache that updates the NSAS. But I'm open to suggestions.

hmm, you are probably right. I was kind of hoping to avoid a circular code dependency (so that the cache would not necessarily need to know about the NSAS and vice versa), but thinking about it, I don't know how to avoid it right now. Added todo.

comment:12 Changed 9 years ago by stephen

  • Owner changed from stephen to jelte

Changes are fine but some lines start with a tab. After fixing these, please merge (I don't need to see the code again).

comment:13 Changed 9 years ago by jelte

  • Resolution set to complete
  • Status changed from reviewing to closed

Merged, closing ticket. On to #491 :)

comment:14 in reply to: ↑ 9 Changed 9 years ago by zhanglikun

Replying to jelte:

Replying to stephen:

message_entry.h
Comment: the fact that the RRsetRef object is used here makes me wonder whether this should be part of the basic RRset class hierarchy. The comments in src/lib/dns/rrset.h suggests that the design is not final and should be revisited. Thoughts?

add to discussion on copy?

No, it's only used by message entry, doesn't make sense to make it be part of RRset.
your wondering may be caused by the similirity between RRsetRef and RRsetPtr, :)

genMessage(): the parameter description in the header suggests that "response" is the first argument and time_now the second.

Presumably the reason that a number of methods are "protected" instead of "private" is to allow for testing? If so, a one-line comment in the header file explaining that would be helpful.

we'll have to ask likun

Done

Note: See TracTickets for help on using tickets.