Opened 9 years ago

Closed 9 years ago

#447 closed task (complete)

MemoryZone::find()

Reported by: vorner Owned by: vorner
Priority: medium Milestone:
Component: data source Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity:
Sub-Project: Feature Depending on Ticket:
Estimated Difficulty: 0.0 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

This implements the MemoryZone::find() function. It handles delegation (because it was simple few lines of more code), but does not yet handle CNAME and DNAME.

Subtickets

Change History (13)

comment:1 Changed 9 years ago by vorner

  • Owner changed from vorner to UnAssigned
  • Status changed from new to reviewing

It is ready for review, in branches/trac447, r3955-r3961.

comment:2 Changed 9 years ago by hanfeng

  • Owner changed from UnAssigned to hanfeng

comment:3 follow-up: Changed 9 years ago by jinmei

Hmm, I didn't notice it was assigned to Feng and have looked into the
code a bit. I'll stop my review at this point, but dump the comments
I've had so far:

  • do we need to include <iostream>?

MemoryZoneImpl::add()

  • this method does not provide strong exception guarantee (if it throws after insert(), the inserted node will remain). It may be okay at least for now, because in the case of exception quite likely that the entire zone will be given up. But it may have a tricky implication if/when we want to support dynamic update using this implementation (consider the case where we add a new name and then allocating the data fails with an exception). so I'd suggest adding (doxygen) comment about this point.
  • I'm not sure why we need the tricks with getRelativeName(). As far as I understand it the rbtree should work without trimming the origin. Perhaps it intends to be performance optimization (for find())? If so, I suspect the performance gain wouldn't be much substantial because getting the prefix itself is a quite expensive operation, and I suspect the possible gain doesn't outweigh the additional code complexity.

MemoryZoneImpl::find()

  • unfortunately, the delegation case cannot be that simple. (Using the unittest setting) consider the case we have ns.subdomain.example.org as RDATA of the subdomain.example.org/NS, and an A and/or AAAA glue RRs of ns.subdomain.example.org in the example.org zone. When we ask for ns.subdomain.example.org we should return the delegation (unless we really need the glues for additional section processing), but the current implementation would return an exact match. So, my suggestion is to leave the delegation case open for now and focus on the easier case as originally intended for this ticket.

comment:4 in reply to: ↑ 3 ; follow-up: Changed 9 years ago by vorner

Replying to jinmei:

  • do we need to include <iostream>?

Ups, I forgot to remove that one after some debugging. No, it is not needed.

MemoryZoneImpl::add()

  • this method does not provide strong exception guarantee (if it throws after insert(), the inserted node will remain). It may be okay at least for now, because in the case of exception quite likely that the entire zone will be given up. But it may have a tricky implication if/when we want to support dynamic update using this implementation (consider the case where we add a new name and then allocating the data fails with an exception). so I'd suggest adding (doxygen) comment about this point.

I changed find() in a way empty domain is considered nonexistent. That way we have strong exception guarantee at the interface level (while the internal representation changed, the external behaviour stays the same).

  • I'm not sure why we need the tricks with getRelativeName(). As far as I understand it the rbtree should work without trimming the origin. Perhaps it intends to be performance optimization (for find())? If so, I suspect the performance gain wouldn't be much substantial because getting the prefix itself is a quite expensive operation, and I suspect the possible gain doesn't outweigh the additional code complexity.

I wanted it not because of the finding (but traversing the set of labels twice, once in ZoneTable?, once in the zone itself seems duplicated work), but to save some memory. But it probably is marginal.

Do you think I should remove it in favor of simplicity?

MemoryZoneImpl::find()

  • unfortunately, the delegation case cannot be that simple. (Using the unittest setting) consider the case we have ns.subdomain.example.org as RDATA of the subdomain.example.org/NS, and an A and/or AAAA glue RRs of ns.subdomain.example.org in the example.org zone. When we ask for ns.subdomain.example.org we should return the delegation (unless we really need the glues for additional section processing), but the current implementation would return an exact match. So, my suggestion is to leave the delegation case open for now and focus on the easier case as originally intended for this ticket.

Ah, you're right, I didn't think of that. In that case it doesn't seem to be easilly solved by this way, so I removed the wrong code.

Thanks for the comments.

comment:5 in reply to: ↑ 4 Changed 9 years ago by jinmei

Replying to vorner:

MemoryZoneImpl::add()

  • this method does not provide strong exception guarantee (if it throws after insert(), the inserted node will remain). It may be

[...]

I changed find() in a way empty domain is considered nonexistent. That way we have strong exception guarantee at the interface level (while the internal representation changed, the external behaviour stays the same).

Hmm, since we need to support empty non terminal cases I'm not sure if this can be a remedy for this problem. But in any case, I'd not request it provide the strong guarantee at least at the moment. If we can ensure that, that would be fine; if not I'm okay with simply documenting it for now.

  • I'm not sure why we need the tricks with getRelativeName().

[...]

I wanted it not because of the finding (but traversing the set of labels twice, once in ZoneTable?, once in the zone itself seems duplicated work), but to save some memory. But it probably is marginal.

Ah, okay, if that's the reason, I guess it's marginal because the rbtree essentially does the same compression (normally, only one node consumes the memory for the common origin part). So,

Do you think I should remove it in favor of simplicity?

yes, I personally prefer the simplicity, comparing the motivation and the current level of code complexity.

comment:6 Changed 9 years ago by jinmei

BTW, to be clear, I've not fully reviewed the branch. I simply dumped my intermediate comments I've got until I noticed the review was assigned to Feng. I'm assuming Feng will complete the review.

comment:7 follow-up: Changed 9 years ago by hanfeng

  • Owner changed from hanfeng to vorner
  • Since each node in RBTree store the relative name, so the function
    getRelativeName()
    
    is redundant.
  • For function
    FindResult find(const Name& full_name, RRType type) const
    
    ,it is supposed to modify nothing. so this part code
    if(node->getData()->empty()) {
               node->setData(DomainPtr());
    }
    
    will make the code hard to understand and debug.
  • Currently, for each RBNode, it has two state, empty or non-empty which can be get through
    node->isEmpty()
    
    ,But with Domain as node data, you have the
    if(node->getData()->empty())
    
    check, which also means RBNode is empty, this make the RBNode state not clear. Now we don't have delete for RBTree and MemoryZone?, and the implementation of MemoryZone? should guarantee that if DOMAIN is empty, the data of the node should empty instead with empty map left.


trivial coding style

  • The exception for each module is declared at the begining which make the code reading much easier, but Memory Zone violate it.
  • For function like getRelativeName, if we really need it, since it's a tool function, it's better to be extracted from class definition or declare as static function.


comment:8 in reply to: ↑ 7 ; follow-up: Changed 9 years ago by vorner

  • Owner changed from vorner to hanfeng

Replying to hanfeng:

  • Since each node in RBTree store the relative name, so the function
    getRelativeName()
    

Yes, I removed it.

check, which also means RBNode is empty, this make the RBNode state not clear. Now we don't have delete for RBTree and MemoryZone?, and the implementation of MemoryZone? should guarantee that if DOMAIN is empty, the data of the node should empty instead with empty map left.

OK, I added a note about strong exception safety and removed the hack with empty map. This is maybe slightly more complicated, but the biggest argument is probably mentioned in previous comment (I wasn't aware of need to store empty domains, but if it is needed for something, than let's have them).

trivial coding style

  • The exception for each module is declared at the begining which make the code reading much easier, but Memory Zone violate it.

But these exceptions are not module-level. They are class level. Nobody else throws them than the class, so the class should contain them. I see it better to have names like SomeClass::BadCondition? and OtherClass::BadCondition? than SomeClassBadCondition? and OtherClassBadCondition?. And, in generated documentation, you have them noted on the same page as the class itself, which is better than searching the class list for similar names.

In short, I think the exceptions should be in the class scope. I put them near the function that throws them, but I'm OK with putting them at the top of the class (but there's a written rule that constructors should go first, so below them). Anyway, I'm not sure we had any codestyle rule about position of exception classes. If you think this should be different, we should bring this to wider attention.

  • For function like getRelativeName, if we really need it, since it's a tool function, it's better to be extracted from class definition or declare as static function.

It was out of the public class (and is removed now anyway), so hidden from API. This way, as member function, it had access to member data, which it used and it was cleaner as where it belongs. I consider that more readable and more the OOP way. If it was inside public interface (the .h file), I would see the point, but inside the cpp/pimpl class?

comment:9 in reply to: ↑ 8 ; follow-up: Changed 9 years ago by hanfeng

  • Owner changed from hanfeng to vorner

Replying to vorner:

I wasn't aware of need to store empty domains, but if it is needed for something, than let's have them.

As I know, empty domain is the non-terminal domain

trivial coding style

But these exceptions are not module-level. They are class level. Nobody else throws them than the class, so the class should contain them. I see it better to have names like SomeClass::BadCondition? and OtherClass::BadCondition? than SomeClassBadCondition? and [...]

Sorry for my inaccurate word, module here means MemoryZone?. Just like classes, exception also has hierarchy, more specific exception is raised by specific class. There is no code guildline for exception class definition, but please check the code in src/lib/dns, for example name.h, all the exceptions are only needed by name class. Personally, I think declare the exception on the top is more clear.

In

result::Result add(const ConstRRsetPtr& rrset)

I see you add check to make sure new added rrset should belong to current zone, but I am wondering the steps to add rrset is that, at first, find suitable zone in datasrc first, then add rrset into the zone, I am afraid the same check will be done in two places.

comment:10 Changed 9 years ago by hanfeng

Another thing I want to clarify here, you can check the email in dev with title "rbtree find implementation", so In find function, if the return value is PARTIALMATCH, you have to check whether the return domain is the origin domain(zone name), if so it is same with NXDOMAIN.

comment:11 in reply to: ↑ 9 Changed 9 years ago by vorner

Replying to hanfeng:

But these exceptions are not module-level. They are class level. Nobody else throws them than the class, so the class should contain them. I see it better to have names like SomeClass::BadCondition? and OtherClass::BadCondition? than SomeClassBadCondition? and [...]

Sorry for my inaccurate word, module here means MemoryZone?. Just like classes, exception also has hierarchy, more specific exception is raised by specific class. There is no code guildline for exception class definition, but please check the code in src/lib/dns, for example name.h, all the exceptions are only needed by name class. Personally, I think declare the exception on the top is more clear.

Ok, I'm going to ask on the mailing list. But there are two kind of hierarchies (both for exceptions and other classes) ‒ inheritance hierarchy and namespace hierarchy. They are orthogonal. I don't know why I shouldn't use both of them.

And it might more clear/convenient when you already opened the header file and started reading the code. But more often, you just open the documentation and read the class description, not its header. And having the list of related exceptions on the same page (as nested classes) is more convenient than them just having same prefix in the name.

In

result::Result add(const ConstRRsetPtr& rrset)

I see you add check to make sure new added rrset should belong to current zone, but I am wondering the steps to add rrset is that, at first, find suitable zone in datasrc first, then add rrset into the zone, I am afraid the same check will be done in two places.

The check is maybe redundant in current use. But from design point of view, the class itself can't make any assumption about who and how it is used. Adding the check is just defensive programming. It checks it is used in a correct way without relying on external world, and it is there for the same reason we have asserts in the code and checks for NULL pointers as parameters all over the place, even at places where passing a NULL pointer makes no sense. It makes the code more robust and makes debugging easier in case of problem.

And the check was there as part the getRelativeName before, it was just implicit.

About the PARTIALMATCH, currently it means NXDOMAIN always. Bud I added a comment warning about it.

comment:12 Changed 9 years ago by hanfeng

I think all these points I raised is about coding style and which currently no final decision. Probably we will left it to the face to face meeting, So we can just keep it now.

Except them I think we can merge it into trunk.

comment:13 Changed 9 years ago by vorner

  • Resolution set to complete
  • Status changed from reviewing to closed

Ok, I merged it, lets talk about the exceptions.

Note: See TracTickets for help on using tickets.