Opened 9 years ago

Closed 8 years ago

#781 closed task (complete)

Define cryptographic API

Reported by: stephen Owned by: jelte
Priority: very high Milestone: Sprint-20110503
Component: Unclassified Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket: tsig
Estimated Difficulty: 6.0 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

We need to define a thin layer above an existing cryptographic library so that if required, we can change libraries at time time in the future.

As well as the interface into the cryptographic operations, the API should also take account of the possible use of HSMs, possibly by defining a PKCS#11-style interface. (It should also allow for the use of multiple HSMs at the same time; typical use of this would be to roll a key from one HSM to another when the first has reached the end of its life.)

Subtickets

Attachments (1)

tsigkey.cc.diff (2.3 KB) - added by jinmei 9 years ago.

Download all attachments as: .zip

Change History (38)

comment:1 Changed 9 years ago by jelte

correction: It should *implement* pkcs#11, not define it. The API of this library does not necessarily have to have anything that is pkcs#11-related (well it would probably need some abstract key concept)

comment:2 Changed 9 years ago by shane

  • Milestone changed from Year 3 Task Backlog to Sprint-20110419
  • Priority changed from major to critical

comment:3 Changed 9 years ago by shane

  • Estimated Difficulty changed from 0.0 to 6

comment:4 Changed 9 years ago by shane

  • Priority changed from critical to blocker

comment:5 Changed 9 years ago by jelte

  • Owner set to jelte
  • Status changed from new to assigned

comment:6 Changed 9 years ago by jelte

  • Owner changed from jelte to UnAssigned
  • Status changed from assigned to reviewing

For TSIG support, we only need HMAC sign and verify calls. So what I did for this ticket was create a libcrypto.so, which right now only has the functions signHMAC() and verifyHMAC() with botan as a backend and no dynamic loading (which we'll need for pkcs#11) or fancy initialization. So it's merely two functions, and not a 'full' api yet (following the methodology of not adding code we don't use yet).

I've added a lot of tests for the three algorithms we support right now (hmac-md5, hmac-sha1 and hmac-sha256), taken from two rfcs.

The TSIGKey in libdns++ can now be constructed from a string (<name>:<secret>[:algorithm]), and has a toText() which returns name:secret:algorithm.

comment:7 Changed 9 years ago by jinmei

  • Owner changed from UnAssigned to jinmei

comment:8 follow-up: Changed 9 years ago by jinmei

Some initial feedback:

First, I made a few trivial editorial cleanups and pushed them to the
branch (commit 4f1d8d0).

Second, it doesn't compile from the scratch because crypto_unittests
relies on some libdns++ header files (explicitly/implicitly), some of
which are generated at compile time of libdns++. (specifically,
rrclass.h)

This particular problem could be solved in some way, but I think it's
actually one aspect of a bigger design discussion: does it make sense
to rely on DNS specific notions from the crypto (interface) API? More
specifically, does it make sense to use TSIGKey and OutputBuffer? also
as input parameters to signHMAC()/verifyHMAC()?

Personally, I'd decouple independent things as much as possible. In
this specific case it seems that input/result data can simply be
opaque binary data and can be represented as (e.g.) a pair of a
pointer and length, and/or a std vector object. Representing a key is
not that trivial because we have to think about how to represent the
algorithm, but if we want to keep the crypto API separate from
libdns++, I think it makes more sense to introduce a generic
representation (i.e. not in the form of dns::Name) of hash algorithms,
have the user of the API (e.g. DNS message signer/verifier) convert
the algorithm DNS name to the generic representation. Or, if our
intent is to limit the usage of the crypto API to DNS related
purposes, it would make more sense to define the API under libdns++.

Another discussion point at the design level: with the current set of
functions it seems we cannot compute the digest incrementally; we need
to prepare an entire block of data to be signed/verified and pass it
to the sign/verify function. This would also mean, e.g., we need to
copy incoming DNS message (which could be large) to somewhere and
tweak it for sign/verify (such as adjusting the DNS header) and pass
the copy to the sign/verify function (I notice ldns works that way).
I'm afraid its overhead may not be marginal.

BIND 9's crypto interface allows incremental computation of the digest
by providing interfaces to both "update" and "final".

Considering our primary use cases of TSIG will be zone transfers and
dynamic updates, which would relatively be less performance sensitive,
the above copy overhead might be acceptable if this helps make other
part of the interface/implementation simpler and more understandable.
But I think we should at least consider pros and cons of both designs
and then pick up one.

At this point I'll give the ticket back to Jelte.

comment:9 Changed 9 years ago by jinmei

  • Owner changed from jinmei to jelte

comment:10 in reply to: ↑ 8 ; follow-up: Changed 9 years ago by jelte

  • Owner changed from jelte to jinmei

Replying to jinmei:

Some initial feedback:

First, I made a few trivial editorial cleanups and pushed them to the
branch (commit 4f1d8d0).

Thanks, I noticed I forgot to push the last 3 commits :/ (two were also editorial things and documentation, one added python support for the new functions)

Second, it doesn't compile from the scratch because crypto_unittests
relies on some libdns++ header files (explicitly/implicitly), some of
which are generated at compile time of libdns++. (specifically,
rrclass.h)

Changed order in lib/makefile, but see also below I guess.

This particular problem could be solved in some way, but I think it's
actually one aspect of a bigger design discussion: does it make sense
to rely on DNS specific notions from the crypto (interface) API? More
specifically, does it make sense to use TSIGKey and OutputBuffer? also
as input parameters to signHMAC()/verifyHMAC()?

I was wondering if TSIGKey shouldn't be in libcrypto itself. Since it would still be dependent on Name I chose to leave it where it was.

Personally, I'd decouple independent things as much as possible. In
this specific case it seems that input/result data can simply be
opaque binary data and can be represented as (e.g.) a pair of a
pointer and length, and/or a std vector object. Representing a key is
not that trivial because we have to think about how to represent the
algorithm, but if we want to keep the crypto API separate from
libdns++, I think it makes more sense to introduce a generic
representation (i.e. not in the form of dns::Name) of hash algorithms,
have the user of the API (e.g. DNS message signer/verifier) convert
the algorithm DNS name to the generic representation. Or, if our
intent is to limit the usage of the crypto API to DNS related
purposes, it would make more sense to define the API under libdns++.

If we change the methods to not take TSIGKey, we do need to pass it nearly all information about it; it would make sense to make a class for that, which I originally had, but then I discovered we already had isc::dns::TSIGKey, and making two separate classes to represent the same object doesn't make much sense to me (and two lists of algorithms (three if you count the backend-lib-specific one too), etc).

I also considered moving TSIGKey to libcrypto, but as i said above, it would still have the link to isc::dns::Name.

Making the crypto api fully dns independent seems to me to be too abstract, unless we need it for other uses for it than dns-related things. The dns-related things we do need would then need yet another level of abstraction (and a dependency the other way around, since I expect that one to live in libdns++), and another conversion step.

If we don't need it for other uses, it might indeed make sense to move it into libdns++, since one will always depend on the other. We should then also seriously consider making it fully optional then.

Another discussion point at the design level: with the current set of
functions it seems we cannot compute the digest incrementally; we need
to prepare an entire block of data to be signed/verified and pass it
to the sign/verify function. This would also mean, e.g., we need to
copy incoming DNS message (which could be large) to somewhere and
tweak it for sign/verify (such as adjusting the DNS header) and pass
the copy to the sign/verify function (I notice ldns works that way).
I'm afraid its overhead may not be marginal.

BIND 9's crypto interface allows incremental computation of the digest
by providing interfaces to both "update" and "final".

I considered this, but didn't really see the point; the only advantage is that you can make the 'tweaks' directly to update(), but in essence you're rebuilding the Renderer again then (which is, of course, one way to go). For message size considerations I don't think it's necessary. Either way you need to prepare the wire format fully for compression, and make sure there is no 'second' rendering, if that would change the format.

comment:11 in reply to: ↑ 10 Changed 9 years ago by jinmei

Replying to jelte:

BIND 9's crypto interface allows incremental computation of the digest
by providing interfaces to both "update" and "final".

I considered this, but didn't really see the point; the only advantage is that you can make the 'tweaks' directly to update(), but in essence you're rebuilding the Renderer again then (which is, of course, one way to go). For message size considerations I don't think it's necessary. Either way you need to prepare the wire format fully for compression, and make sure there is no 'second' rendering, if that would change the format.

I'm not sure if my concern was delivered clearly, so let me show you some examples.

  1. We are going to sign a response message which is an answer to a signed query. We can get the bare wire format DNS message to be signed, e.g., as <renderer.getData(), renderer.getLength()>. If we have the incremental mode, we can incrementally compute the digest for "request MAC" (of the signed query), "the wire data" (which can be about 64KB), and "TSIG variables". On the other hand, with the current API we first have to prepare a possibly large buffer that can store all these three, and copy them into the buffer and then call signHMAC().
  1. We are verifying signed TCP messages. According to RFC2845, it is possible that we have TSIG RR in the first message in the TCP stream, followed by multiple (up to 100) messages without TSIG, and then by another message with TCP that also covers all intermediate messages. To verify the second TSIG with the current API, we need to keep all intermediate messages and then concatenate all into a single buffer and pass it to verifyHMAC().

Do you think these are okay, or are my concerns a non issue with the current API?

comment:12 Changed 9 years ago by jinmei

  • Owner changed from jinmei to jelte

comment:13 Changed 9 years ago by jelte

  • Status changed from reviewing to accepted

No you're right, I had forgotten the intermediate messages needed to be included in the hash as well if messages are skipped. I'll keep this ticket and change it.

comment:14 Changed 9 years ago by jelte

  • Owner changed from jelte to jinmei
  • Status changed from accepted to assigned

I've added an HMAC class, which can be used with the classic 'create, update-update-update, finalize' method. I did leave in the 'old' functions, which are now convenience functions to do the above in one line, should the caller have a fixed set of data.

I changed inputs to be of void*+length for the data-to-be-signed, and for the signature-to-verify.

I did leave the sign() output argument to be of class OutputBuffer?, which is the most convenient for callers that don't want to care about the size of the output; if it turns out we want a version that can take a bare pointer and expected length, we should be able to easily add them.

I also left in the TSIGKey use for now; It kind of depends on which way we want the dependency to be (crypto dependent on dns, dns dependent on crypto, both together in one lib, or fully independent, but then we need a 'common' way to refer to algorithms, like agreed-upon strings, which seems potentially brittle)

Please have another look

comment:15 Changed 9 years ago by jinmei

Trivial things

  • I made some trivial editorial changes.
  • I also changed 'catch exception by object' to 'by const reference'. (I suggest you try cppcheck. It's very good at finding this issue)

Some higher level discussions

  • I'm okay with the design decision of making this stuff DNS related things only (but IMO it should be noted in the documentation). But in that case should it better be in libdns++? My question here is more about design choice, but there's also practical issues: right now there's a mutual dependency between libdns++ and lib(b)crypto, which would cause various troubles. For example, as soon as we have more TSIG stuff libdns++ tests would need to link lib(b)crypto. But lib(b)crypto tests also need to link libdns++.
  • Based on this decision, passing TSIGKey to hmac may be okay, but (just to be sure) is TSIG the only thing in the DNS world that requires HMAC?
  • I'm still not sure why we need to use OutputBuffer? to store results. Isn't a simple vector (or even a plain old array because we should be able to know the size) sufficient?
  • I noticed the new library is named "libbcrypto" (double b). Is the "redundant b" intentional? Perhaps you meant b as in "bind" to avoid name conflict? If so, I'd consider a better name because it would be very easy to confuse it with libcrypto (single b). how about libdnscrypto, considering the fact that it's designed to be DNS specific? But note also that the mutual dependency issue above.
  • do we need a python wrapper for the new crypt stuff?

crypto.h

  • Mostly a matter of taste, but I found doxygen's "\exception" markup is useful to describe possible exceptions thrown from a method/function.
  • The HMAC class definition: I'd see some design level discussions. e.g. this is intended to be used for DNS related hmac. I'd be also like to know if 'key' passed to the constructor can be destroyed once the constructor returns (apparently it can).
  • HMAC class: when using pimpl, copy and assignment is (normally) not trivial. we should either provide custom copy constructor/assignment operator or make the class non copyable. (I guess in the case of this class the latter is probably fine).

crypto.cc

  • at least within this .cc file I don't see the need for including iostream, iomanip, or string.
  • getHash(): can we assume Botan::get_hash() never throws a (Botan-specific) exception? Otherwise we need to catch and convert it in the HMACImpl constructor.
  • getHash(): is the overhead of this function acceptable? (pure question, I'm not sure). For sha256, we need name comparison (relatively expensive) three times, and it happens every time we need to sign/verify hmac.
  • I suspect the way of defining 'init' can cause a static initialization fiasco (of course it depends on the details of LibraryInitializer?, which I didn't dig into. this is a general concern)
  • HMACImpl constructor: this is not exception safe (hmac_ could leak).
  • HMACImpl::update/sign/verify: can we assume the underlying Boton methods don't throw?
  • HMACImpl::sign/verify: I suspect these cannot be a const member function semantically (even though hmac_ (a pointer) is intact, "*hmac_" would be modified by the underlying methods). Likewise, HMAC::sign/verify cannot be a const member function, I suspect.

crypto_unittests.cc

  • checkBuffer: in ASSERT_EQ() the expected data is the first parameter. it seems to be reversed. Also, you may want to use UnitTestUtil::matchWireData in lib/dns/tests for this type of test.
  • doHMACTestConv: it's probably better to use 'const string&' for data and key_str. key can be const. same for doHMACTestDirect() and doHMACTest().
  • (just a comment) when you don't worry about extending the buffer (and possible failure of it) you can simply pass 0 to the OutputBuffer? constructor.
  • UnsupportedAlgorithm? case doesn't seem to be tested.
  • BadKey? test is probably better to be done for HMAC constructor because it's closer to the point where the exception is thrown.

tsigkey.h

  • the new TSIGKey constructor (doc): it doesn't return anything.
  • I'm not sure if the :-separated notation is a good one. What if <name> contains ':'? wouldn't it be better (and easy to implement) to have it three string parameters? (and allow an empty string for algorithm). BTW, it should describe which algorithm is used when it's omitted.

tsigkey.cc
(I've not yet closely looked at the string parser due to the question
about the interface, see above).

  • in the 'from string' TSIGKey constructor, variable length array is not very portable. in fact, in this case this part of logic can just be simpler:
            vector<uint8_t> secret;
            decodeBase64(secret_str, secret);
    	// (we don't need secret_b)
    ...
            impl_ = new TSIGKeyImpl(key_name, algo_name, &secret[0]
                                    secret.size());
    
  • in that constructor, is it a good idea to catch all isc exceptions? can't we simply let various exceptions be propagated?
  • in toText(), I'd simplify the vector initialization:
        const vector<uint8_t> secret_v(static_cast<const uint8_t*>(getSecret()),
                                       static_cast<const uint8_t*>(getSecret()) +
                                       getSecretLength());
        // no for loop is necessary
    

tsigkey_python_test.py

exception cases are missing?

comment:16 Changed 9 years ago by jinmei

  • Owner changed from jinmei to jelte
  • Status changed from assigned to reviewing

comment:17 Changed 9 years ago by jinmei

ah, and one last thing: you may want to have a changelog entry.

comment:18 follow-up: Changed 9 years ago by jelte

  • Owner changed from jelte to jinmei

Replying to jinmei:

Trivial things

  • I made some trivial editorial changes.
  • I also changed 'catch exception by object' to 'by const reference'. (I suggest you try cppcheck. It's very good at finding this issue)

thanks :) (i don't really use cppcheck much yet since it is a tad noisy here, we should probably put in a bit of effort to make it clean and keep it that way)

Some higher level discussions

  • I'm okay with the design decision of making this stuff DNS related things only (but IMO it should be noted in the documentation). But in that case should it better be in libdns++? My question here is more about design choice, but there's also practical issues: right now there's a mutual dependency between libdns++ and lib(b)crypto, which would cause various troubles. For example, as soon as we have more TSIG stuff libdns++ tests would need to link lib(b)crypto. But lib(b)crypto tests also need to link libdns++.

Right. Originally I did not intend to make libdns make direct use of libcrypto, so having a dependency the other way wouldn't be a problem. Right now we have no plans to use cryptographic functionality outside of libdns++ (as far as i'm aware), but there is quite a big chance this may change in the future. Another question is how 'clean' we want libdns++ to be regarding non-dns specific things (like crypto :)).

Since using this directly from libdns++ seems to be the direction, I've removed the TSIGKey reference. In this branch, it is still dependent on libdns because of the outputbuffer (even though i added other sign options, see below), but outputbuffer is about to be moved out of libdns anyway, when it has been, there will no dependency on libdns++ anymore.

  • Based on this decision, passing TSIGKey to hmac may be okay, but (just to be sure) is TSIG the only thing in the DNS world that requires HMAC?

AFAIK it is. We may want it in the future for other things, for instance to secure command channels. But this is moot now :)

  • I'm still not sure why we need to use OutputBuffer? to store results. Isn't a simple vector (or even a plain old array because we should be able to know the size) sufficient?

We don't *need* it, but i figured outputbuffer would me the most logical type to have for this, which would save the caller from another checkoutputlength/alloc set of calls. A vector would do that as well, but will cause yet another copy afterwards. This was under the assumption that the caller would have an outputbuffer already anyway, so as i noted in my previous comment, it should be pretty easy to change or add sign() calls with the needed types.

Anyhow, I added both for flexibility. I also added a size_t len argument for all three, so we can support truncated signatures.

  • I noticed the new library is named "libbcrypto" (double b). Is the "redundant b" intentional? Perhaps you meant b as in "bind" to avoid name conflict? If so, I'd consider a better name because it would be very easy to confuse it with libcrypto (single b). how about libdnscrypto, considering the fact that it's designed to be DNS specific? But note also that the mutual dependency issue above.

It was intentional, what about libb10crypto? (to make it more clear that it is not a typo :), and leaves the option open to use it for non-dns things)

Changed it to that now.

  • do we need a python wrapper for the new crypt stuff?

Probably, but it seemed like a good idea to agree on the API first.

crypto.h

  • Mostly a matter of taste, but I found doxygen's "\exception" markup is useful to describe possible exceptions thrown from a method/function.

Ack, changed.

  • The HMAC class definition: I'd see some design level discussions. e.g. this is intended to be used for DNS related hmac. I'd be also like to know if 'key' passed to the constructor can be destroyed once the constructor returns (apparently it can).

Ack, but also moot now

  • HMAC class: when using pimpl, copy and assignment is (normally) not trivial. we should either provide custom copy constructor/assignment operator or make the class non copyable. (I guess in the case of this class the latter is probably fine).

ack

crypto.cc

  • at least within this .cc file I don't see the need for including iostream, iomanip, or string.

ack, removed

  • getHash(): can we assume Botan::get_hash() never throws a (Botan-specific) exception? Otherwise we need to catch and convert it in the HMACImpl constructor.

With the use as it was, yes, but when we add other algorithms it might throw an AlgorithmNotFound?, so i've added a catch for that.

  • getHash(): is the overhead of this function acceptable? (pure question, I'm not sure). For sha256, we need name comparison (relatively expensive) three times, and it happens every time we need to sign/verify hmac.

Well, this comparison has to occur somewhere when going from a TSIGKey to an underlying algorithm. Since for the TSIGKey removal I introduced an enum for hmac hash algorithms, this is no longer a problem of this specific lib :)

  • I suspect the way of defining 'init' can cause a static initialization fiasco (of course it depends on the details of LibraryInitializer?, which I didn't dig into. this is a general concern)

I 'copied' the approach from botan; i.e. there is a RAII object for this now, and the problem is passed off to the program using this lib. We need something like this in the future anyway for initialization options.

  • HMACImpl constructor: this is not exception safe (hmac_ could leak).

added delete

  • HMACImpl::update/sign/verify: can we assume the underlying Boton methods don't throw?

only invalidkeylength, afaict, should we add 'general' Botan::Exception catches?

  • HMACImpl::sign/verify: I suspect these cannot be a const member function semantically (even though hmac_ (a pointer) is intact, "*hmac_" would be modified by the underlying methods). Likewise, HMAC::sign/verify cannot be a const member function, I suspect.

removed

crypto_unittests.cc

  • checkBuffer: in ASSERT_EQ() the expected data is the first parameter. it seems to be reversed. Also, you may want to use UnitTestUtil::matchWireData in lib/dns/tests for this type of test.

That would introduce a build-order-dependency conflict again :)

  • doHMACTestConv: it's probably better to use 'const string&' for data and key_str. key can be const. same for doHMACTestDirect() and doHMACTest().

changed

The old API made it impossible to get to that state, since it was not possible to create a TSIGKey with an unknown algorithm. But since we now pass an identifier (and 'unknown' is a possible value), i added a test now.

  • BadKey? test is probably better to be done for HMAC constructor because it's closer to the point where the exception is thrown.

added

tsigkey.h

  • the new TSIGKey constructor (doc): it doesn't return anything.

removed line

  • I'm not sure if the :-separated notation is a good one. What if <name> contains ':'? wouldn't it be better (and easy to implement) to have it three string parameters? (and allow an empty string for algorithm). BTW, it should describe which algorithm is used when it's omitted.

we already have the three-part constructor (the original one with Name args). This is simply to make it possible to use the de facto standard for the string representation, as introduced by dig and copied by multiple other implementations. If the name contains a ':', it will simply fail to create a key using that representation. We could add logic for escapes etc. But we could also simply say not to use names with :.

tsigkey.cc

  • in the 'from string' TSIGKey constructor, variable length array is not very portable. in fact, in this case this part of logic can just be simpler:

ack, updated

  • in that constructor, is it a good idea to catch all isc exceptions?

No, but it's better than enumerating them all and catching them one by one (and almost certainly missing one or two). I think we should have a hierarchy of exceptions actually (like nameparseerror that is a superclass for emptylabl, toolonglabel, etc.etc.). Probably something to discuss and/or make a task for.

can't we simply let various exceptions be propagated?

Then the caller would have the same problem. This way I can honestly say that the \exception list in the docs are complete :) (and, more importantly, that the caller does not have to take 15 different exceptions into account, which may or may not change, not to mention the fact that the documentation will probably not be updated if it does indeed change).

  • in toText(), I'd simplify the vector initialization:

ack, done

tsigkey_python_test.py

exception cases are missing?

ack, added

proposed changelog entry:
[func] Introduced an API for cryptographic operations. Currently it only supports HMAC, intended for use with TSIG. The current implementation uses Botan as the backend library.

comment:19 in reply to: ↑ 18 Changed 9 years ago by jinmei

Replying to jelte:

Trivial things

  • I made some trivial editorial changes.
  • I also changed 'catch exception by object' to 'by const reference'. (I suggest you try cppcheck. It's very good at finding this issue)

thanks :) (i don't really use cppcheck much yet since it is a tad noisy here, we should probably put in a bit of effort to make it clean and keep it that way)

FYI on cppcheck: Older versions of cppcheck seem to complain about
more things (in some sense it's good, but it also means relatively
more false positives). I'm using 1.47, and it should run cleanly (via
'make cppcheck' at the top_dir). We even run it on a buildbot.

BTW: I've added this level of these fixes to the revised code, too.

Some higher level discussions

  • I'm okay with the design decision of making this stuff DNS related things only (but IMO it should be noted in the documentation). But in that case should it better be in libdns++? [...]

Right. Originally I did not intend to make libdns make direct use of libcrypto, so having a dependency the other way wouldn't be a problem. Right now we have no plans to use cryptographic functionality outside of libdns++ (as far as i'm aware), but there is quite a big chance this may change in the future. Another question is how 'clean' we want libdns++ to be regarding non-dns specific things (like crypto :)).

Now that the mutual dependency problem is going to be solved, I'm okay
with separating these.

  • I'm still not sure why we need to use OutputBuffer? to store results. Isn't a simple vector (or even a plain old array because we should be able to know the size) sufficient?

We don't *need* it, but i figured outputbuffer would me the most logical type to have for this, which would save the caller from another checkoutputlength/alloc set of calls. A vector would do that as well, but will cause yet another copy afterwards. This was under the assumption that the caller would have an outputbuffer already anyway, so as i noted in my previous comment, it should be pretty easy to change or add sign() calls with the needed types.

Anyhow, I added both for flexibility. I also added a size_t len argument for all three, so we can support truncated signatures.

Hmm, maybe you're thiking about updating a buffer to construct a
complete DNS message with a TSIG digest "on the fly"? That is,
conceptually like this?

   message.renderHeader(buffer);
   message.renderAnswer(buffer);
   message.renderAuthority(buffer);
   message.renderAdditional(buffer); // without TSIG
   message.renderTSIGParameters(buffer); // render params before digest
   signHMAC(buffer.getData(), buffer.getLength() - sizeof(TSIG), buffer);

If so, this is actually different from what I'd do for signing. I
suspect, due to various corner cases that would require the
incremental (init-)update-final steps, we'd normally end up getting
the digest in a separate space and copying it to the buffer/renderer.

But anyway, we cannot just be so sure how it would actually be used
yet, so I don't necessarily request this interface to be removed at
this stage.

  • I noticed the new library is named "libbcrypto" (double b). [...]

It was intentional, what about libb10crypto? (to make it more clear

that it is not a typo :), and leaves the option open to use it
for non-dns things)

Changed it to that now.

Frankly, libb10crypto sounds a bit awkward:-) but I must confess I'm
not good at naming things. I just come up with a generic name
"libcryptolink", but since at least b10crypto is much less confusing
I'm okay with it, too if you like it better.

  • do we need a python wrapper for the new crypt stuff?

Probably, but it seemed like a good idea to agree on the API first.

Fair enough.

  • getHash(): is the overhead of this function acceptable? (pure question, I'm not sure). For sha256, we need name comparison (relatively expensive) three times, and it happens every time we need to sign/verify hmac.

Well, this comparison has to occur somewhere when going from a TSIGKey to an underlying algorithm. Since for the TSIGKey removal I introduced an enum for hmac hash algorithms, this is no longer a problem of this specific lib :)

Okay, we may want to convert string/name to the crypto definition at
the time of TSIGKey construction, but that's another issue.

  • I suspect the way of defining 'init' can cause a static initialization fiasco (of course it depends on the details of LibraryInitializer?, which I didn't dig into. this is a general concern)

I 'copied' the approach from botan; i.e. there is a RAII object for

this now, and the problem is passed off to the program using
this lib. We need something like this in the future anyway for
initialization options.

Okay, this is much safer, but the new approach seems to have another
issue: a buggy program can instantiate it multiple times (not sure if
it causes a bad thing, though). Also, do we really expect it to be
destructed at the end of the program? If not we don't even need the
RAII trick; we can simply provide an initialization function, say,
initCrypto(), and have the main program call it (this approach doesn't
solve the duplicate initialization problem, though).

Beside, requiring programs to do something specific (either
instantiate an RAII object or call an init function) is inconvenient;
I guess you were aware of that and that's why you hid it in crypto.cc
in the first implementation. IMO safety is more important than
convenience in general, so if the policy is to accept this
inconvenience, I'm okay with that. But if we want to avoid that while
avoiding the initialization fiasco, one approach would be to hide the
initialization logic in the wrapper classes such as HMACImpl. This
approach has a drawback that we must not forget doing it in every
wrapper class, so this is basically a design tradeoff. I'm okay with
any safer approach.

  • HMACImpl constructor: this is not exception safe (hmac_ could leak).

added delete

Ack, but to be super exception safe, I'd rather (e.g.) use
boost::scoped_ptr. I don't know the details of the Botan internals,
but, for example, if it internally performs (throwing version of) new
and doesn't convert the standard exceptions, the revised code is still
not exception safe. Using RAII like scoped_ptr will ensure the
safeness regardless of the Botan internals. (And if we use scoped_ptr
we don't need an explicit destructor)

  • HMACImpl::update/sign/verify: can we assume the underlying Boton methods don't throw?

only invalidkeylength, afaict, should we add 'general'
Botan::Exception catches?

It might depend on the internal details of Botan, but as long as we
want to prevent any Botan specific exceptions from being propagated,
I think it would be safer to catch larger exceptions. That way we can
ensure that even if the Botan implementation changes in its future
versions.

crypto_unittests.cc

  • checkBuffer: in ASSERT_EQ() the expected data is the first parameter. it seems to be reversed. Also, you may want to use UnitTestUtil::matchWireData in lib/dns/tests for this type of test.

That would introduce a build-order-dependency conflict again :)

Yeah I know. So for now it probably doesn't make sense to use
matchWireData. In future I think we should extract non DNS specific
test utilities into a shared place so that it can be used by other
modules.

  • I'm not sure if the :-separated notation is a good one. [...]

we already have the three-part constructor (the original one with Name args). This is simply to make it possible to use the de facto standard for the string representation, as introduced by dig and copied by multiple other implementations. If the name contains a ':', it will simply fail to create a key using that representation. We could add logic for escapes etc. But we could also simply say not to use names with :.

Okay. I have more comments about the implementation, though. See below.

  • in that constructor, is it a good idea to catch all isc exceptions?

No, but it's better than enumerating them all and catching them one by one (and almost certainly missing one or two). I think we should have a hierarchy of exceptions actually (like nameparseerror that is a superclass for emptylabl, toolonglabel, etc.etc.). Probably something to discuss and/or make a task for.

can't we simply let various exceptions be propagated?

Then the caller would have the same problem. This way I can honestly say that the \exception list in the docs are complete :) (and, more importantly, that the caller does not have to take 15 different exceptions into account, which may or may not change, not to mention the fact that the documentation will probably not be updated if it does indeed change).

For these points, I basically envision that higher level applications
catch exceptions at the level of isc::Exception (or even
std::exception, from which all isc::Exception classes are derived),
just like b10-auth does (in which case it returns SERVFAIL). The
above questions of mine are based on this observation. But I admit
this is probably a debatable point, and so I'm okay with keeping the
current code and defer the discussion to a separate ticket or dev list
thread.

proposed changelog entry:

[func] Introduced an API for cryptographic operations. Currently it only supports HMAC, intended for use with TSIG. The current implementation uses Botan as the backend library.

I'd note that build now requests Botan. I'd also mark this entry with
'*' due to the additional build time dependency.

Now some more code level comments on the revised one:

crypto.cc

  • HMACImpl::verify: is the "len > getOutputLength()" case tested? (just checking)
  • HMACImpl constructor now doesn't have to be "explicit" (because it takes more than one parameters). but I don't mind if you intentionally left it so that e.g. we don't forget to re-add it if and when we re-revise it to a single-parameter constructor. same for the HMAC constructor.
  • HMAC::HashAlgorithm? I'd add doxygen description for the enums (even though the meaning is pretty clear from the symbol). also, I'd like to see clarification of what "UNKNOWN" means (e.g. whether it's just a placeholder and always trigger some error, or if it can be used in some valid scenario)

crypto/tests

  • in checkData(), expected/actual parameters to ASSERT_EQ() still seem to be reversed (maybe it's helpful to name them "expected_data", "actual_data" instead of "data1"/"data2")
  • doHMACTestArray() uses a variable length array, which is (unfortunately) not very portable. considering this is a test, maybe we can just allocate a memory dynamically, noting it's not exception safe as a comment (we could also use std::vector/boost::scoped_array)
  • about secret/expected data (I should have brought this up in the first review, sorry for the raising a new topic): the arrays can be const uint8_t[]. as for std::string data, shouldn't it actually better be vector<uint8_t>? In either case, the initialization using the for loop could be simplified:
        const std::string data3(50, 0xdd);
        // std::string data3;
        // for (int i = 0; i < 50; ++i) {
        //    data3.push_back(0xdd);
        //}
    
    (as a bonus we can also make the variable const this way)

tsigkey test

  • I'd use a different name than "TSIGTest" (for TSIGKeyFromToString) because I'd like to use it for TSIG sign/verify (I hope it's fair to say "TSIGTest" is more suitable for these kinds of tests:-). google test doesn't seem to allow name conflict at this level.

tsigkey.cc

  • TSIGKey from string constructor: (I should have pointed this out in the first review) you may want to make it an 'explicit' constructor.
  • TSIGKey from string constructor: this seems to be incorrect:
            } else {
                pos2 = str.size() - pos;
            }
    
    You can realize the bug by making the len(key name) <= len(secret):
        EXPECT_EQ("example.:MSG6Ng==:hmac-md5.sig-alg.reg.int.",
                  TSIGKey("example.:MSG6Ng==").toText());
    
    We could fix it in a straightforward way, but I actually wonder whether we could achieve what we want using a higher level interface, rather than via lower level of "position + length" APIs, which are generally error prone and difficult to understand. So, for example, how about the attached diff? This one is not that concise compared to the original code, but the logic should be more intuitive and much safer.
  • Regarding the previous point, if you choose fixing the original code, I suggest deferring the definition of "secret_str" just before it's needed.
  • And, in either case, it would be good if we extract the alg_name check used in the two constructors.

Changed 9 years ago by jinmei

comment:20 Changed 9 years ago by jinmei

  • Owner changed from jinmei to jelte

comment:21 Changed 9 years ago by jinmei

Oh, and one more thing (as usual:-)

It's probably better to use private inheritance for boost::noncopyable:

class HMAC : public boost::noncopyable {

comment:22 follow-up: Changed 9 years ago by jelte

  • Defect Severity set to N/A
  • Owner changed from jelte to jinmei
  • Sub-Project set to DNS

Replying to jinmei:

<snip discussion on usage>

But anyway, we cannot just be so sure how it would actually be used
yet, so I don't necessarily request this interface to be removed at
this stage.

ok, let's keep that in mind for when we finalize this

Frankly, libb10crypto sounds a bit awkward:-) but I must confess I'm
not good at naming things. I just come up with a generic name
"libcryptolink", but since at least b10crypto is much less confusing
I'm okay with it, too if you like it better.

Yeah... cryptolink is fine actually. For consistency, I've renamed not only the library file, but also the directory, namespace, main filename and main class name. So you may need to do a bit of rebuilding :p

Oh i've also moved HMAC stuff into crypto_hmac.[cc|h].

Okay, we may want to convert string/name to the crypto definition at
the time of TSIGKey construction, but that's another issue.

Yes, we could change TSIGKey to not have a Name member, but a HashAlgorithm?, and make the getter perform a switch and return the static Name for the specific algorithms.

<snip discussion on initialization and raii>

Okay, so here's the other Big change; We now provide a singleton factory object, that is automatically initialized on the first call to CryptoLink::getCryptoLink() (which is the only call that will give you access to anything), but can be initialized manually beforehand (which isn't necessary right now but will be later if we want to pass initialization arguments). This object is the only entry point for all the classes in the cryptolink namespace. Right now that is only HMAC. So it contains a createHMAC object (which returns a raw pointer, because HMAC itself is noncopyable, i've added documentation that suggests to place it into a scoped_ptr).

Another 'protection' mechanism is that HMAC itself is not directly constructable; it's constructor is private. Only CryptoLink? can create them now (i've made it a friend). And the intention is to do this for all classes we make that wrap backend library objects (i.e. all should be nonconstructible, and probably noncopyable as well, and have a factory function in CryptoLink? if they must be created by the caller).

  • HMACImpl constructor: this is not exception safe (hmac_ could leak).

added delete

Ack, but to be super exception safe, I'd rather (e.g.) use
boost::scoped_ptr. I don't know the details of the Botan internals,
but, for example, if it internally performs (throwing version of) new
and doesn't convert the standard exceptions, the revised code is still
not exception safe. Using RAII like scoped_ptr will ensure the
safeness regardless of the Botan internals. (And if we use scoped_ptr
we don't need an explicit destructor)

fair enough, done.

  • HMACImpl::update/sign/verify: can we assume the underlying Boton methods don't throw?

only invalidkeylength, afaict, should we add 'general'
Botan::Exception catches?

It might depend on the internal details of Botan, but as long as we
want to prevent any Botan specific exceptions from being propagated,
I think it would be safer to catch larger exceptions. That way we can
ensure that even if the Botan implementation changes in its future
versions.

I've added one more excpetion class, LibraryError?, which is to be thrown if anything 'unexpected' happens; in this case whenever we see Botan::Exception being raised, and I've added a catch for Botan::Exception around all calls to botan.

That would introduce a build-order-dependency conflict again :)

Yeah I know. So for now it probably doesn't make sense to use
matchWireData. In future I think we should extract non DNS specific
test utilities into a shared place so that it can be used by other
modules.

+1

Now some more code level comments on the revised one:

crypto.cc

  • HMACImpl::verify: is the "len > getOutputLength()" case tested? (just checking)

they are now

  • HMAC::HashAlgorithm? I'd add doxygen description for the enums (even though the meaning is pretty clear from the symbol). also, I'd like to see clarification of what "UNKNOWN" means (e.g. whether it's just a placeholder and always trigger some error, or if it can be used in some valid scenario)

ack. There is, added comments.

crypto/tests

  • in checkData(), expected/actual parameters to ASSERT_EQ() still seem to be reversed (maybe it's helpful to name them "expected_data", "actual_data" instead of "data1"/"data2")

good point, done :)

  • doHMACTestArray() uses a variable length array, which is (unfortunately) not very portable. considering this is a test, maybe we can just allocate a memory dynamically, noting it's not exception safe as a comment (we could also use std::vector/boost::scoped_array)

While i think tests should be clean, I happen not to care too much about exception safety (is there is any unexpected exception, there is something to be fixed in the test or the code anyway :)), so that should be ok here.

  • about secret/expected data (I should have brought this up in the first review, sorry for the raising a new topic): the arrays can be const uint8_t[]. as for std::string data, shouldn't it actually better be vector<uint8_t>? In either case, the initialization using the for loop could be simplified:
        const std::string data3(50, 0xdd);
        // std::string data3;
        // for (int i = 0; i < 50; ++i) {
        //    data3.push_back(0xdd);
        //}
    
    (as a bonus we can also make the variable const this way)

wow i did not even know that existed. done.

tsigkey test

  • I'd use a different name than "TSIGTest" (for TSIGKeyFromToString) because I'd like to use it for TSIG sign/verify (I hope it's fair to say "TSIGTest" is more suitable for these kinds of tests:-). google test doesn't seem to allow name conflict at this level.

oh sure, made it TSIGStringTest

tsigkey.cc

  • TSIGKey from string constructor: (I should have pointed this out in the first review) you may want to make it an 'explicit' constructor.

ok

  • TSIGKey from string constructor: this seems to be incorrect:
            } else {
                pos2 = str.size() - pos;
            }
    
    You can realize the bug by making the len(key name) <= len(secret):
        EXPECT_EQ("example.:MSG6Ng==:hmac-md5.sig-alg.reg.int.",
                  TSIGKey("example.:MSG6Ng==").toText());
    
    We could fix it in a straightforward way, but I actually wonder whether we could achieve what we want using a higher level interface, rather than via lower level of "position + length" APIs, which are generally error prone and difficult to understand. So, for example, how about the attached diff? This one is not that concise compared to the original code, but the logic should be more intuitive and much safer.

that patch was from before you did the const changes, so it didn't apply :) But it's merged.

  • And, in either case, it would be good if we extract the alg_name check used in the two constructors.

right, done

A lot has changed again, sorry for that :)

(but i did run cppcheck this time ;))

comment:23 in reply to: ↑ 22 ; follow-up: Changed 9 years ago by jinmei

Replying to jelte:

Okay, so here's the other Big change; We now provide a singleton factory object, that is automatically initialized on the first call to CryptoLink::getCryptoLink() (which is the only call that will give you access to anything), but can be initialized manually beforehand (which isn't necessary right now but will be later if we want to pass initialization arguments). This object is the only entry point for all the classes in the cryptolink namespace. Right now that is only HMAC. So it contains a createHMAC object (which returns a raw pointer, because HMAC itself is noncopyable, i've added documentation that suggests to place it into a scoped_ptr).

I have a concern about the approach of returning a bare pointer and
moving the cleanup responsibility to the caller. In general, it's
fragile if the module that performs 'new' is different from the one
performing 'delete', because these two implementations may be
different and incompatible. In the case of createHMAC() (or future
similar methods) this seems to be a real concern because it looks like
we expect dynamically loadable modules to use this API (and some of
which could be built separately from BIND 10).

I myself am not so comfortable with relying on smart pointers too much
(and I chose to return a bare pointer in a factory function of
edns.cc), but I suspect in this case we need to return a smart pointer
instead of the bare one.

Another 'protection' mechanism is that HMAC itself is not directly constructable; it's constructor is private. Only CryptoLink? can create them now (i've made it a friend). And the intention is to do this for all classes we make that wrap backend library objects (i.e. all should be nonconstructible, and probably noncopyable as well, and have a factory function in CryptoLink? if they must be created by the caller).

Using friend always makes me nervous, due to its excessive power:-)
But this is probably one of the rare cases where the use of it is
justified in that it would be far better than leaving the room of an
HMAC object being constructed without initialization accidentally and
in that we can centralize the necessary initialization. But please
add some documentation about such consideration, maybe to the general
description of the CryptoLink? class.

Other parts of the code mostly looks okay with some minor points.
I've directly pushed changes for trivial and editorial things.
Other points follow:

crypto_unittest

  • 'will leak' sounds a bit too strong because it will normally be

freed. s/will/can/?

        // note: this is not exception-safe, and will leak, but
  • now looking at the revised code with simplified std::string, I also noticed we could even omit some of the variables. e.g. instead of
        const std::string data4(50, 0xcd);
    ...
        doHMACTest(data4, secret4, 25, HMAC::MD5, hmac_expected4, 16);
    
    we can say
        doHMACTest(std::string(50, 0xcd), secret4, 25, HMAC::MD5, hmac_expected4, 16);
    
    (sorry for not noticing this in the previous review, but I guess this is how experimental refactoring works...) But if you think the separate variable makes the code more readable, I'm okay with that (I'd probably object if it were a non const variable, but with a const variable it's mostly a matter of preference)

tsigkey.cc

  • the proposed patch being adopted, it's probably safer to include <sstream> explicitly.

CryptoLink? and HMAC classes

  • private inheritance is the default and can be omitted, although I guess you might intentionally want to be explicit and in that case I'm okay with that.

(new) crypto.cc/crypto.h

  • brief description is missing for CryptoLink?:
    class CryptoLinkImpl;
    
    /// \brief 
    
  • many of the include files now don't seem to be necessary: most, if not all DNS related ones, and maybe some botan and std ones

crypto_unittests

  • This code leaks hmac:
        EXPECT_THROW(crypto.createHMAC(NULL, 0, HMAC::MD5), BadKey);
        EXPECT_THROW(crypto.createHMAC(NULL, 0, HMAC::UNKNOWN), UnsupportedAlgorithm);
    
    (but see also the bigger issue above)

comment:24 Changed 9 years ago by jinmei

  • Owner changed from jinmei to jelte

comment:25 Changed 9 years ago by jinmei

Sorry, usual one another thing:

I noticed the string format for the TSIG key specification is not
compatible with the -y option of BIND 9's dig. dig's format is
"[hmac:]name:key", while ours is "<name>:<secret>[:<algorithm>]".
If the purpose is to follow the "de facto standard", and the dig
format is that de facto, it would be better to be compatible with that.

comment:26 in reply to: ↑ 23 ; follow-up: Changed 9 years ago by jelte

  • Owner changed from jelte to jinmei

Replying to jinmei:

Replying to jelte:

Okay, so here's the other Big change; We now provide a singleton factory object, that is automatically initialized on the first call to CryptoLink::getCryptoLink() (which is the only call that will give you access to anything), but can be initialized manually beforehand (which isn't necessary right now but will be later if we want to pass initialization arguments). This object is the only entry point for all the classes in the cryptolink namespace. Right now that is only HMAC. So it contains a createHMAC object (which returns a raw pointer, because HMAC itself is noncopyable, i've added documentation that suggests to place it into a scoped_ptr).

I have a concern about the approach of returning a bare pointer and
moving the cleanup responsibility to the caller. In general, it's
fragile if the module that performs 'new' is different from the one
performing 'delete', because these two implementations may be
different and incompatible. In the case of createHMAC() (or future
similar methods) this seems to be a real concern because it looks like
we expect dynamically loadable modules to use this API (and some of
which could be built separately from BIND 10).

I myself am not so comfortable with relying on smart pointers too much
(and I chose to return a bare pointer in a factory function of
edns.cc), but I suspect in this case we need to return a smart pointer
instead of the bare one.

deferring this for now (as per discussion on jabber)

Another 'protection' mechanism is that HMAC itself is not directly constructable; it's constructor is private. Only CryptoLink? can create them now (i've made it a friend). And the intention is to do this for all classes we make that wrap backend library objects (i.e. all should be nonconstructible, and probably noncopyable as well, and have a factory function in CryptoLink? if they must be created by the caller).

Using friend always makes me nervous, due to its excessive power:-)
But this is probably one of the rare cases where the use of it is
justified in that it would be far better than leaving the room of an
HMAC object being constructed without initialization accidentally and
in that we can centralize the necessary initialization. But please
add some documentation about such consideration, maybe to the general
description of the CryptoLink? class.

ok. I must confess I was a bit surprised to see an actual use-case for friend myself :)

BTW, there is supposed to be a way to only provide friend access to specific functions, but I haven't figured out how to use that the way i want yet (like, ideally, we would only give createHMAC access to the constructor, and to the constructor only). Perhaps I misunderstood that this is possible.

Other parts of the code mostly looks okay with some minor points.
I've directly pushed changes for trivial and editorial things.
Other points follow:

crypto_unittest

  • 'will leak' sounds a bit too strong because it will normally be

freed. s/will/can/?

sure, why not :)

  • now looking at the revised code with simplified std::string, I also noticed we could even omit some of the variables. e.g. instead of
        const std::string data4(50, 0xcd);
    ...
        doHMACTest(data4, secret4, 25, HMAC::MD5, hmac_expected4, 16);
    
    we can say
        doHMACTest(std::string(50, 0xcd), secret4, 25, HMAC::MD5, hmac_expected4, 16);
    
    (sorry for not noticing this in the previous review, but I guess this is how experimental refactoring works...) But if you think the separate variable makes the code more readable, I'm okay with that (I'd probably object if it were a non const variable, but with a const variable it's mostly a matter of preference)

why not, done :) (come to think of it, we could also do the same 'trick' for all arrays that have repeated data, but since we have those in there now anyway i think we can leave them)

tsigkey.cc

  • the proposed patch being adopted, it's probably safer to include <sstream> explicitly.

ack

CryptoLink? and HMAC classes

  • private inheritance is the default and can be omitted, although I guess you might intentionally want to be explicit and in that case I'm okay with that.

i prefer to have it in there explicitely

(new) crypto.cc/crypto.h

  • brief description is missing for CryptoLink?:
    class CryptoLinkImpl;
    
    /// \brief 
    
  • many of the include files now don't seem to be necessary: most, if not all DNS related ones, and maybe some botan and std ones

oops, I had done this, but had forgotten to push it, sorry for that. Once i push now, you'll find it before your editorial changes commit :)

another change that i then forgot to push as well is the renaming of the main files; crypto.cc -> cryptolink.cc and crypto.h -> cryptolink.h (for consistency reasons given the class it defines)

crypto_unittests

  • This code leaks hmac:
        EXPECT_THROW(crypto.createHMAC(NULL, 0, HMAC::MD5), BadKey);
        EXPECT_THROW(crypto.createHMAC(NULL, 0, HMAC::UNKNOWN), UnsupportedAlgorithm);
    
    (but see also the bigger issue above)

if that is so, it wouldn't be the code here that causes the leaks, since createHMAC should itself not end up with any allocated data if there is an exception inside it (you wouldn't be able to delete it, you'd never get the pointer).

comment:27 in reply to: ↑ 26 ; follow-up: Changed 8 years ago by jinmei

Replying to jelte:

deferring this for now (as per discussion on jabber)

Okay (I plan to explain my point in more detail in a separate comment).

ok. I must confess I was a bit surprised to see an actual use-case for friend myself :)

BTW, there is supposed to be a way to only provide friend access to specific functions, but I haven't figured out how to use that the way i want yet (like, ideally, we would only give createHMAC access to the constructor, and to the constructor only). Perhaps I misunderstood that this is possible.

Hmm, "a member function of one class can be the firend of another"
(by the Stroustrup book), so it should be possible to say like this:

    //friend class CryptoLink;
    friend HMAC* CryptoLink::createHMAC(const void*, size_t,
                                        const HashAlgorithm);

but when I tried to confirm this I found a larger issue: cryptolink.h
includes crypto_hmac.h, and vice versa. Such mutual-inclusion is not
necessarily invalid, but can easily make the design fragile in
general, and in this particular case, actually makes the above friend
declaration non-compilable (the friend declaration requires the
definition of CryptoLink?, which is in cryptolink.h, which includes
crypto_hmac.h, which defines HMAC with the friend declaration...)

Looking at it more closely, and based on the observation that hash
algorithms are not specific to HMAC (for example, we may want to
replace our homemade sha1 implementation using this framework, or we
may want to integrate it to this framework), I'd suggest moving
HashAlgorithm? to hmaclink.h (or a separate crypto_hash.h or
something). Then, in cryptolink.h, we can only put a forward
declaration of 'class HMAC' and avoid including crypto_hmac.h.

Whether or not we use function-based friend (on which I don't have a
strong opinion) I think this is a good change.

And (yet) another point I noticed while I was looking at the code: it
looks possible to change the return value of
CryptoLink::getCryptoLink() to 'const CryptoLink?&' (and then we have
to change CryptoLink::createHMAC() a const member function).

Other changes look okay. One follow up comment:

crypto_unittests

  • This code leaks hmac:
        EXPECT_THROW(crypto.createHMAC(NULL, 0, HMAC::MD5), BadKey);
        EXPECT_THROW(crypto.createHMAC(NULL, 0, HMAC::UNKNOWN), UnsupportedAlgorithm);
    
    (but see also the bigger issue above)

if that is so, it wouldn't be the code here that causes the leaks, since createHMAC should itself not end up with any allocated data if there is an exception inside it (you wouldn't be able to delete it, you'd never get the pointer).

Ah, okay, you're right. (Though, as you originally said 'will leak'
(in error cases), if createHMAC() doesn't throw as expected the
allocated memory will leak. But I'm okay with ignoring this scenario)

comment:28 Changed 8 years ago by jinmei

Now, about the new/delete issue.

Apparently we've been talking about slightly different things, so Let
me use an absurd example to highlight my point.

Consider a library that returns a bare pointer to class Foo object
(after new'ing the object):

// in the library header file:
class Foo; // forward declaration
Foo* createFoo();

// in the library implementation file:
Foo*
createFoo() {
    return (new Foo());
}

Also consider an external program (an executable that uses the library
or a dynamically loadable third party module or something) that does:

void
operator delete(void* ptr) {
    // do some fancy stuff, e.g., assuming a trailing padding at the
    // end of the allocated memory to check buffer overrun, and abort
    // the program if the assumption isn't met.
}

void
someFunction() {
   Foo* foo = createFoo();

   // play with foo for a while

   delete foo;
}

Then, at the point of 'delete foo', the externally defined operator
delete will be called, and an unexpected result will happen.

Of course, this is absurd (the program should have defined operator
new, too if it wants to customize delete), but if the external program
is built independently from the library (by a different
(version/vendor of) compiler, using a different (version/vendor) of
standard C++ library, or a different setting for the same compiler
(debug/release mode, etc)), essentially the same thing can happen in a
more implicit way. This is the point of my concern.

To avoid this scenario, we can provide a separate deleter function:

// in the library header
void deleteFoo(Foo* ptr_to_foo);

// in the library implementation file:
void
deleteFoo(Foo* ptr_to_foo) {
    delete ptr_to_foo;
}

And have the application call deleteFoo() instead of 'delete'ing the
bare pointer directly (we could even force it if we want, by defining
a deleter object and making the Foo destructor private, but it's
probably too much). In addition, we can combine it with
boost::shared_ptr so that the caller doesn't have to care about
releasing the resource:

typedef boost::shared_ptr<Foo> FooPtr;

FooPtr
createFoo() {
   return (FooPtr(createFoo(), deleteFoo));
}

Or, if we like to give the caller more flexibility, we could give them
a bare pointer, suggesting the use of a smart pointer this way at
their side.

Such a problem is not specific to the cryptolink API; it applies to
any relationship between a third party application that uses BIND 10
libraries. But since I heard this API will be used with an external
loadable module, and I thought it might be developed and built
separately from BIND 10, I thought the risk was higher in this case,
and am suggesting a safer approach. In this case, the API is not
supposed to be very fast (because the underlying crypto stuff will be
quite heavy anyway), so the overhead of the indirection wouldn't be an
issue. Also, since the deleter function is quite trivial and simple,
I think it's a good deal here.

Another "solution" is to declare we request any program that uses
libcryptolink use the compatible delete to the new used in
libcryptolink and do nothing specific. If you prefer this approach, I
don't necessarily object to it at this moment; do this that way and
move forward for now, and discuss it at the dev list and/or biweekly
call. The risk I'm concerned about is not an immediate one, so we
have time to decide (and completing the whole TSIG things sooner would
be more important).

comment:29 Changed 8 years ago by jinmei

And finally, about the "-y" portability.

My suggestion is to pick up either dig or drill format and complete this
branch, and open a new ticket to add support for the other format.

Technically, there's an unsolvable ambiguity, but in practice it
should be able to detect which format is used run time. It will make
the code a bit more complicated, but if it helps attract both dig and
drill users that would be worth introducing. In any case, I think it
doesn't have to be implemented right now.

comment:30 Changed 8 years ago by jinmei

  • Owner changed from jinmei to jelte

comment:31 in reply to: ↑ 27 Changed 8 years ago by jelte

  • Owner changed from jelte to jinmei

Replying to jinmei:

Looking at it more closely, and based on the observation that hash
algorithms are not specific to HMAC (for example, we may want to
replace our homemade sha1 implementation using this framework, or we
may want to integrate it to this framework), I'd suggest moving
HashAlgorithm? to hmaclink.h (or a separate crypto_hash.h or
something). Then, in cryptolink.h, we can only put a forward
declaration of 'class HMAC' and avoid including crypto_hmac.h.

I've moved it up to cryptolink.h; if we do decide to implement hashes through this it should probably be moved there then, but as a general identifier list it also fits directly in the main link header imo. I did rename UNKNOWN to UNKNOWN_HASH for clarity.

Whether or not we use function-based friend (on which I don't have a
strong opinion) I think this is a good change.

Well now that this has been cleaned up i see no reason to declare all of CryptoLink? friend, so i've changed this too.

And (yet) another point I noticed while I was looking at the code: it
looks possible to change the return value of
CryptoLink::getCryptoLink() to 'const CryptoLink?&' (and then we have
to change CryptoLink::createHMAC() a const member function).

It can, the question is, should it? It kind of depends on what you consider that CryptoLink? instance to be; createHMAC could be const, but in a way such factory functions do modify the instance (since the backend does some internal memory management, again the question being do we consider that state of the cryptolink object). But it's a minor change so if you insist on it i can add it quickly :)

comment:32 follow-up: Changed 8 years ago by jelte

About the delete; *now* I see, thanks :) We were indeed talking about something different. I've added a deleteHMAC(), but left in the raw pointer return value for now. We should probably indeed raise it (for next meeting?), and make a style/practices thing out of it. We might also provide an extensive description on the wiki (or blog?:)) and point to it in the documentation

comment:33 in reply to: ↑ 32 Changed 8 years ago by jinmei

The latest code looks *mostly* okay. Two points:

  • I'd suggest adding a brief description for enum HashAlgorithm?
  • According to the sense of introducing the deleter method for HMAC, it's better to use it in crypto_unittests.cc. That is, use shared_ptr with the deleter instead of scoped_ptr (in my understanding we cannot specify a deleter for scoped_ptr):
    // Instead of this...
    //      boost::scoped_ptr<HMAC> hmac_sign(crypto.createHMAC(secret,
    //                                                          secret_len,
    //                                                          hash_algorithm));
    
    // do this:
            boost::shared_ptr<HMAC> hmac_sign(crypto.createHMAC(secret,
                                                                secret_len,
                                                                hash_algorithm),
                                              deleteHMAC);
    

These should be quite trivial and I don't think we need another
cycle of review for them. After fixing these please free to merge it.

As for the constness of getCryptoLink(), createHMAC(), I'm okay with
leaving them non const. In fact, I also thought about it in terms of
const semantics; if we consider a CryptoLink? class encapsulating the
backend crypto engine with all its internal states, createHMAC() would
better be defined as non const. But the actual implementation only
stores the initialization object, which should be intact once the
initialization is completed. That's why I thought it made sense to
constify it. But on the second thought, in reality createHMAC()
(possibly) modifies some internal state behind the cryptolink, so it
would probably be better not to give a false impression about safety
of the use of createHMAC(), etc.

You didn't mention the "-y option" format, but as I already responded
I'm okay with leaving it open for now.

comment:34 Changed 8 years ago by jinmei

  • Owner changed from jinmei to jelte

comment:35 Changed 8 years ago by jinmei

FYI (don't worry, this is not my usual one more additional comment:-)

I've started making #812 reviewable, assuming the important part of this
branch is now fixed. My first step was to merge #781 and a new branch
based on a recent master, and it required some straightforward but
a non negligible amount of work due to the recent re-organization with
the introduction of libutil. So I first fixed the conflicts and made
necessary fixups. They are available in

commit 00e36b253bd643b007f89a8ccb3addac1a723eba
commit 65874313065ddb02756d5ef06c8802c7540c17c7

You may be able to save your time when you merge the branch into
master by simply merging these fixes, too. (but you may find it
rather simpler to do it yourself - they are basically boring, trivial
fixes)

comment:36 Changed 8 years ago by shane

  • Feature Depending on Ticket set to tsig

comment:37 Changed 8 years ago by jelte

  • Resolution set to complete
  • Status changed from reviewing to closed

Merged, closing ticket

Note: See TracTickets for help on using tickets.