Opened 8 years ago

Closed 8 years ago

#1575 closed task (complete)

move data_source.cc:Nsec3Param to libdns++

Reported by: jinmei Owned by: jinmei
Priority: high Milestone: Sprint-20120207
Component: libdns++ Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket: NSEC3
Estimated Difficulty: 4 Add Hours to Ticket: 0
Total Hours: 9.0 Internal?: no

Description

We'll need a public interface to calculating a hashed name based on
NSEC3 params in order to support NSEC3 in a wider place (specifically
in the new in memory and database data source implementations).

The more appropriate place would be libdns++, as this calculation
is independent from data sources (and we'll probably need it in future
for the validator). We should also eventually use botan for the hash
implementation and deprecate our in-house copy. But that's probably a
future task.

To me this doesn't seem to have to be a class since it doesn't need
any intermediate state, but that's probably a matter of taste.

We'll also need more tests, and be more careful about corner cases
(such as against unknown algorithms - the current implementation
naively assumes it's sha-1 while accepting a generic algorithm
parameter).

Subtickets

Change History (15)

comment:1 Changed 8 years ago by jelte

  • Estimated Difficulty changed from 0 to 4

comment:2 Changed 8 years ago by jelte

  • Milestone changed from Next-Sprint-Proposed to Sprint-20120207
  • Priority changed from major to critical

comment:3 Changed 8 years ago by jinmei

  • Owner changed from UnAssigned to jinmei
  • Status changed from new to accepted

comment:4 follow-up: Changed 8 years ago by jinmei

  • Owner changed from jinmei to UnAssigned
  • Status changed from accepted to reviewing

trac1575 is ready for review.

On thinking about details, I decided to keep it a class. The
rationale was documented in the doxygen comment.

I found one bug in the original implementation: it didn't handle
case-sensitiveness correctly (produced different hash for "com" and
"COM"). The new version fixed it with a test case for it.

I also noted another, slightly related bug in the NSEC3PARAM rdata
implementation: its "from text" constructor didn't handle an empty
salt. It could go to a separate ticket, but I wanted to test the case
for this branch, so I fixed it here, too.

Finally, I provided python binding (0879661). It may be against the
YAGNI principle, but at the same time I wanted to keep the C++ and
Python libraries as compatible as possible. The binding code should
be straightforward (and not that big even if it was written from
scratch), but if the branch looks too big due to it, I'm okay with
deferring it to a separate ticket.

I think it's worth a changelog entry. This is the proposed one:

369.?	[func]		jinmei
	libdns++: a new class NSEC3Hash was introduced as a utility for
	calculating NSEC3 hashes for various purposes.  Python binding was
	provided, too.  Also fixed a small bug in the NSEC3PARAM RDATA
	implementation that empty salt in text representation was
	rejected.  
	(Trac #1575, git TBD)
Last edited 8 years ago by jinmei (previous) (diff)

comment:5 Changed 8 years ago by vorner

  • Owner changed from UnAssigned to vorner

comment:6 in reply to: ↑ 4 ; follow-up: Changed 8 years ago by vorner

  • Owner changed from vorner to jinmei

Hello

Replying to jinmei:

I think it's worth a changelog entry. This is the proposed one:

libdns++: a new class NSEC3Hash was introduced as a utility ofn

I believe there's a typo (should it be utility *for*…?).

Does Doxygen generate the „brief“ descriptions from the first paragraph automatically? It seems so, but I'd prefer having them there, just for the clarity.

The word „alphabets“ looks wrong, I believe an alphabet is a set of characters a language uses to write its words (like Czech alphabet, Greek alphabet, …). Would „letters“ be better? (both in C++ and python docs)

The error message „function must be given NSEC3HASH Rdata“ is possibly correct according to English rules, but it took me like 5 seconds to parse it, as the nominative is at the end. Maybe something „You must provide a NSEC3HASH Rdata to the function“.

I'm little bit worried about the future extendability of this class. Would it be better to have a pointer (shared pointer, or whatever) returned from a static function? That way each algorithm could have its own concrete implementation of the class, if needed, and the function could decide which one should be created. This would also remove the need for having a private implementation class (!NSEC3HashImpl), as the concrete class could be effectively hidden. I understand this implementation is temporary, as good for now and we don't need to support eg. md5. But I'd propose at last changing the interface (to have the function be virtual and allocation being done by the static function), so we can keep it when we revisit it and generalise it.

Also, every time I see the python wrapper file generated form the template, I wonder if we should leave the comments about what should be done to fill in the template in, as this looks as something was forgotten to be done. But that might be for a wider discussion than just this ticket.

Thank you

comment:7 in reply to: ↑ 6 ; follow-up: Changed 8 years ago by jinmei

Replying to vorner:

Thanks for the review.

I think it's worth a changelog entry. This is the proposed one:

libdns++: a new class NSEC3Hash was introduced as a utility ofn

I believe there's a typo (should it be utility *for*…?).

Right, corrected the trac comment directly (so I can simply copy-paste
it later).

Does Doxygen generate the „brief“ descriptions from the first paragraph automatically? It seems so, but I'd prefer having them there, just for the clarity.

Yes, it does, but I have no objection to explicitly adding it for
clarity. Done. (If that's our consensus we should probably document
it as part of guideline).

The word „alphabets“ looks wrong, I believe an alphabet is a set of characters a language uses to write its words (like Czech alphabet, Greek alphabet, …). Would „letters“ be better? (both in C++ and python docs)

Okay, I used "US-ASCII letters", borrowing from RFC4034 and RFC5155.

The error message „function must be given NSEC3HASH Rdata“ is possibly correct according to English rules, but it took me like 5 seconds to parse it, as the nominative is at the end. Maybe something „You must provide a NSEC3HASH Rdata to the function“.

Hmm, I'm okay with revising the message, but in that case I'd rather
be more consistent with messages for this type of error from other
Python built-in libraries. I've changed them that way.

I'm little bit worried about the future extendability of this class. Would it be better to have a pointer (shared pointer, or whatever) returned from a static function? That way each algorithm could have its own concrete implementation of the class, if needed, and the function could decide which one should be created. This would also remove the need for having a private implementation class (!NSEC3HashImpl), as the concrete class could be effectively hidden. I understand this implementation is temporary, as good for now and we don't need to support eg. md5. But I'd propose at last changing the interface (to have the function be virtual and allocation being done by the static function), so we can keep it when we revisit it and generalise it.

Do you mean introducing a base class and defining a factory function
(or method) like this?

class NSEC3HashBase {
public:
    static NSEC3HashBase* create(const rdata::generic::NSEC3PARAM& param);
    virtual std::string calculate(const Name& name) const = 0;
};

(maybe we could name the base class NSEC3Hash and hide the concrete
one in the implementation).

I'm okay with that, and, in fact, as commented in the documentation I
expect we'll probably want to allow tests to define and use fake
derived class so it can hardcode hash values. This framework will be
used for that purpose, too.

But I'm not sure if we want to define derived classes for different
algorithms. Again, as noted, my not-so-fixed intention is to allow
the application to switch parameters (including the algorithm) while
keeping the hash object itself so it can reuse the internal resource
as long as possible. That would be possibly useful for validator,
which would create and hold a single NSEC3Hash object and switch the
parameter for each validation attempt.

Also, every time I see the python wrapper file generated form the template, I wonder if we should leave the comments about what should be done to fill in the template in, as this looks as something was forgotten to be done. But that might be for a wider discussion than just this ticket.

Are you specifically talking about this one (for this branch)?

// This is a template of typical code logic of python class initialization
// with C++ backend.  You'll need to adjust it according to details of the
// actual C++ class.

This should have been removed, and I did it. I tried to remove this
type of comments in the branch, but this one was overlooked. Is there
anything else you noted?

As a general issue, it may help if we mark such comments in the
template like this:

// THIS BLOCK OF COMMENT MUST BE REMOVED FROM THE PUBLISHED VERSION
// This is a template of typical code logic of python class initialization
// with C++ backend.  You'll need to adjust it according to details of the
// actual C++ class.

comment:8 Changed 8 years ago by jinmei

  • Owner changed from jinmei to vorner

comment:9 in reply to: ↑ 7 ; follow-up: Changed 8 years ago by vorner

  • Owner changed from vorner to jinmei

Hello

Does Doxygen generate the „brief“ descriptions from the first paragraph automatically? It seems so, but I'd prefer having them there, just for the clarity.

Yes, it does, but I have no objection to explicitly adding it for
clarity. Done. (If that's our consensus we should probably document
it as part of guideline).

+1 on that. Should we bring it to the call to announce/discuss?

Hmm, I'm okay with revising the message, but in that case I'd rather
be more consistent with messages for this type of error from other
Python built-in libraries. I've changed them that way.

Yes, true.

Do you mean introducing a base class and defining a factory function
(or method) like this?

class NSEC3HashBase {
public:
    static NSEC3HashBase* create(const rdata::generic::NSEC3PARAM& param);
    virtual std::string calculate(const Name& name) const = 0;
};

(maybe we could name the base class NSEC3Hash and hide the concrete
one in the implementation).

Yes, something like this, and I'd prefer hiding the concrete one.

But I'm not sure if we want to define derived classes for different
algorithms. Again, as noted, my not-so-fixed intention is to allow
the application to switch parameters (including the algorithm) while
keeping the hash object itself so it can reuse the internal resource
as long as possible. That would be possibly useful for validator,
which would create and hold a single NSEC3Hash object and switch the
parameter for each validation attempt.

Wouldn't most of the resources be algorithm-dependant anyway?

But what I propose is not forcing to have a separate one for each algorithm. I propose to have the static factory function to allow future flexibility, allowing us to choose then if we want to have a separate one for each or one for all of them, or something in between.

Also, every time I see the python wrapper file generated form the template, I wonder if we should leave the comments about what should be done to fill in the template in, as this looks as something was forgotten to be done. But that might be for a wider discussion than just this ticket.

Are you specifically talking about this one (for this branch)?

I think it was this one, I can't find any other now.

This should have been removed, and I did it. I tried to remove this
type of comments in the branch, but this one was overlooked. Is there
anything else you noted?

As a general issue, it may help if we mark such comments in the
template like this:

// THIS BLOCK OF COMMENT MUST BE REMOVED FROM THE PUBLISHED VERSION
// This is a template of typical code logic of python class initialization
// with C++ backend.  You'll need to adjust it according to details of the
// actual C++ class.

Hopefully it'll help. Maybe it could even be not a comment, so the compiler could fail in case someone forgets to remove it?

comment:10 in reply to: ↑ 9 Changed 8 years ago by jinmei

Replying to vorner:

class NSEC3HashBase {
public:
    static NSEC3HashBase* create(const rdata::generic::NSEC3PARAM& param);
    virtual std::string calculate(const Name& name) const = 0;
};

(maybe we could name the base class NSEC3Hash and hide the concrete
one in the implementation).

Yes, something like this, and I'd prefer hiding the concrete one.

Updated.

As a general issue, it may help if we mark such comments in the
template like this:

// THIS BLOCK OF COMMENT MUST BE REMOVED FROM THE PUBLISHED VERSION
// This is a template of typical code logic of python class initialization
// with C++ backend.  You'll need to adjust it according to details of the
// actual C++ class.

Hopefully it'll help. Maybe it could even be not a comment, so the compiler could fail in case someone forgets to remove it?

I tried to add something like that. It's basically out of scope of
this ticket, but probably small enough to be a separate task, so I
committed the change to this branch, too.

comment:11 Changed 8 years ago by jinmei

  • Owner changed from jinmei to vorner

comment:12 follow-up: Changed 8 years ago by vorner

  • Owner changed from vorner to jinmei

Hello

Looks OK, please merge.

comment:13 Changed 8 years ago by vorner

  • Total Hours changed from 0 to 1.5

comment:14 in reply to: ↑ 12 Changed 8 years ago by jinmei

Replying to vorner:

Hello

Looks OK, please merge.

Thanks, merge done, closing.

comment:15 Changed 8 years ago by jinmei

  • Resolution set to complete
  • Status changed from reviewing to closed
  • Total Hours changed from 1.5 to 9.0
Note: See TracTickets for help on using tickets.