Opened 9 years ago

Closed 9 years ago

#347 closed enhancement (complete)

Implement query counters in auth module

Reported by: naokikambe Owned by: y-aharen
Priority: medium Milestone: y2 12 month milestone
Component: b10-auth Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity:
Sub-Project: Feature Depending on Ticket:
Estimated Difficulty: 0.0 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

Goals of this ticket (the reporter thinks) are:

  • We implement query counter in auth module.
  • Auth module sends the stats the counts out periodically like once every X seconds, or even minutes. ('Asio timers' can help out here?)
  • We add a optional command on the command channel that tells auth to send it out right away.
  • We avoid auth module hanging for seconds when stats module suddenly disappeared.
  • Currently we don't care about 'orphan message' It's for example a reply message which is never processed by any modules.

Original messages of this topic are:
https://lists.isc.org/pipermail/bind10-dev/2010-September/001286.html
https://lists.isc.org/pipermail/bind10-dev/2010-September/001315.html

Subtickets

Change History (19)

comment:1 Changed 9 years ago by naokikambe

  • Owner changed from naokikambe to y-aharen
  • Status changed from new to assigned

Current implement of the stats module is under [source:/branches/trac191-rebased](not yet merged with trunk). It works against the auth module. Please be free to refer it and document StatsModule.

comment:2 Changed 9 years ago by naokikambe

  • Owner changed from y-aharen to naokikambe

This ticket should be owned by A-team because it is the feature on the b10-auth. Kambe in A-team is working on this now. So the owner is changed into him.

comment:3 in reply to: ↑ description Changed 9 years ago by naokikambe

Replying to naokikambe:

  • We avoid auth module hanging for seconds when stats module suddenly disappeared.

This means to implement 'timout' to interrupt sending stats module when a specified time is expired.

comment:4 follow-up: Changed 9 years ago by naokikambe

  • Owner changed from naokikambe to jinmei
  • Status changed from assigned to reviewing

We finished coding and committing source codes at r3686. It is ready for reviewing.

Jinmei-san: Could you review it?

comment:5 in reply to: ↑ 4 ; follow-ups: Changed 9 years ago by jinmei

Replying to naokikambe:

We finished coding and committing source codes at r3686. It is ready for reviewing.

Jinmei-san: Could you review it?

As explicitly asked and told it was urgent I've given it a higher priority for a while. Review isn't completed yet, but here's some initial feedback.

general comments

  • Documentation: It's generally well written, but I'd like to see description about exceptions, that is, whether each method can throw an exception, and if yes, when.
  • If you don't do propset for $Id$, this line should be removed (in git, it will be meaningless "random" string, so we should probably deprecate this convention, at least for newly created files)

stats.h

  • I made some trivial (I think) editorial changes and commit it to the branch (r3689). Please check them.
  • The class name "Counter" sounds like representing a single counter and counter intuitive.
  • I'd like to see more in the class description, including how it generally works and how we are expected to use it.
  • Is this query specific? Except the counter names there doesn't seem to be anything specific to queries (or in that sense even specific to an authoritative server). If the intent is to extend the usage to other types of statistic counters eventually, it should be described accordingly.
  • setStatsSession(): I guess most of the points noted for AuthSrv::setXfrinSession() should apply.
  • getCallback(): I think this method naming is too restrictive, and I'd suggest renaming it to, e.g., getSubmitFunc(). The reason is below.

Whether it's used as a callback (e.g. from a timer) is a matter of caller, and the Counter class doesn't have to care about it. All the responsibility of this method is to return a callable functor object that when called submit the collected statistics through the session channel (if provided via setStatsSession()). In fact, b10-auth could always get the functor and submit the statistics by its own, right? (for example, it may want to submit the final statistics before it shuts down). In general, if some public part of a class can be independent of how it can be used, it's generally better to make that point explicit so that the class is more cohesive (and less coupled with its users).

For the same reason, I think it's better to change the return type to some more generic functor type (and I believe it's easy).

And for the same reason, it's okay (and useful) to mention its intentional usage as a hint, but it should also be described that it can be used in other ways.

stats.cc

  • why do we need a separate namespace? Just following what asio_link did? If so, the intent was to make it a separate library belonging to a separate namespace eventually. Unless the stat module has the same intent, I'd suggest not having the separate namespace for now (we should probably need a common "auth" namespace, although that's a different topic)
  • naming: it would be better to use consistent terminology, i.e. if we need and use a namespace named "statistics", the header/implementation files should better be called "statistics.{h,cc}", and vice versa. I personally prefer (especially in C++) complete words such as "statistics" rather than the geeky abbreviation like "stats", but it's generally a matter of taste. (BTW: the redundancy is also related to the question of why we need a separate name space.)
  • CounterImpl? declarations: getCallback() and sendStats() could/should be const member functions.
  • CounterImpl? constructor: it seems ad hoc to pass "verbose_mode" this way. It's also not a good practice in that it ties the Counter(Impl) class with the AuthSrv? class through an implicit interface (i.e. not via public methods). I can understand if this is a short term workaround until we have a well defined logging framework, however. If so, please explicitly note this in the doxygen comment. If not, and you think this is a good technique, we need to discuss it because I'd strongly disagree:-)
  • CounterImpl? constructor: why dynamically allocating counters_? I don't see the need for it.
  • CounterImpl::sendStats()
  • CounterImpl::sendStats() try-catch part
    • group_sendmsg() (not recv) can also throw. should this be caught or ignored here? probably it should, because we wouldn't like to kill b10-auth just due to some communication failure (though it may be quite rare). In this sense, I suspect the comments should be positioned closer to the group_recvmsg() call.
    • group_recvmsg() can throw (at least) two types of exceptions: SessionTimeout? when it times out; SessionError? for other errors, including when receiving bogus data. I guess we'd like to catch both for the same reason as that for group_sendmsg().
    • The comment and log messages assume that the other end is the b10-stats program, but interface-wise it's not necessarily correct (that could be a third-party app that implements the stat API, right?)
    • Considering these, my suggestion of this part is as follows:
          // group_{send,recv}msg() can throw an exception when encountering
          // an error, and group_recvmsg() will throw an exception on timeout.
          // We don't want to kill the main server just due to this, so we
          // handle them here.
          try {
              const int seq =
                  stats_session_->group_sendmsg(set_stats, "Stats");
              if (verbose_mode_) {
                  std::cerr << "[b10-auth] send statistics data" << std::endl;
              }
      
              // TODO: parse and check response from the statistics recipient;
              // in case of b10-stats, currently it just returns empty message.
              isc::data::ConstElementPtr env, answer;
              stats_session_->group_recvmsg(env, answer, false, seq);
          } catch (const isc::data::SessionError& ex) {
              if (verbose_mode_) {
                  std::cerr << "[b10-auth] "
                            << "timeout happened while sending statistics data: "
                            << ex.what() << std::endl;
              }
          } catch (const isc::data::SessionTimeout& ex) {
              // catch the exception and do nothing.
              if (verbose_mode_) {
                  std::cerr << "[b10-auth] "
                            << "communication error in sending statistics data: "
                            << ex.what() << std::endl;
              }
          }
      
    • BTW, even if it doesn't cause timeout and exception, we should actually not even block in group_recvmsg (and also group_sendmsg(), which can happen because it's a connected channel): b10-auth shouldn't stop handling DNS queries even for 1 second. Since it should relatively a rare event we can live with the current blocking mode (and there are actually other parts of b10-auth that have the same problem). But please leave a TODO comment on this point. We should eventually solve it.
  • CounterImpl::sendStats(): the "catch" case doesn't seem to be covered by the tests
  • Counter::getCallback(): this could/should be a const member function.

comment:6 in reply to: ↑ 5 Changed 9 years ago by jinmei

Replying to jinmei:

Self follow-up:

  • getCallback(): I think this method naming is too restrictive, and I'd suggest renaming it to, e.g., getSubmitFunc(). The reason is below.

On further look, it doesn't even seem to have to be a functor. We could define a callback exactly where it's used, i.e., (in the current implementation) in main.cc, and other methods could just be straightforward normal functions:

in main.cc

void
statsTimerCallback(AuthSrv* server) {
    // we may have to care the case where 'server' is invalid
    server->submitStatistics(); // instead of getStatsCallback()
}
...
        itimer->setupTimer(boost::bind(statsTimerCallback, auth_server),
                           STATS_SEND_INTERVAL_SEC);

AuthSrv::submitStatistics() would simply call counter->submitStatistics() (renamed from getCallback()), and Counter::submitStatistics() would do what CounterImpl::sendStats() currently does.

IMO this is a much better design.

comment:7 Changed 9 years ago by jinmei

  • Owner changed from jinmei to naokikambe

Okay, I've completed my review. Comments below. Giving the ticket back to Kambe-san.

general

  • The Counter class should have its own tests.

auth_srv.h

  • I don't think this have to be a doxygen comment:
    /// for the typedef of asio_link::IntervalTimer::Callback
    
    • BTW, if we change the callback scheme as I suggested, we don't have include asio_link.h/stats.h. (Counter can be a forward declaration, and shouldn't be here anyway. see below)
  • Basically up to you, but this comment seems trivial and I wouldn't bother to mention it:
    #include <auth/stats.h>
    
  • getStatsCallback(): see the callback scheme discussion.
  • Member variable "counter" breaks the spirit of pimpl. The comment also does not seem to make sense to me. Whether it's incremented in a member function of AuthSrv? or AuthSrvImpl?, "counter" can be defined in AuthSrvImpl?. The current implementation also is exception unsafe due to this variable. See below. In any case "counter" should be hidden in impl_. BTW, to be consistent about naming, I'd rename it to "counter_".

auth_srv.cc

  • AuthSrv? constructor: this is not exception safe. If the allocation of counter fails memory for impl_ will leak. Due to this, and for other reasons also, counter should in impl_. BTW: it also doesn't make sense to me to dynamically allocate Counter, especially if/when we hide it in .cc. Unless this is optional (in the current implementation it's not) AuthSrvImpl? can simply contain a Counter object.
  • AuthSrv::processMessage(): in which case we can see getProtocol() be neither UDP nor TCP? In general, I don't like to handle an impossible case this way. I'd either throw an exception or even assert() here.
  • AuthSrv::getStatsCallback(): the comment seems to me so trivial that I wouldn't bother to mention it:-)

auth.spec.pre.in

  • as pointed out before, it's not necessarily "b10-stats":
    +        "command_description": "Send statistics data to b10-stats at once",
    
    I'd say, e.g., "...to a statistics module at once".

asio_link.h

  • I'd like to see more description for the IntervalTimer? class, including how it generally works and how it should be used (sample code may help).
  • I don't understand this statement:
    +/// Copy of this class is prohibited not to call the callback function twice.
    
    Could the callback function be called twice if we allowed copy? (I don't object to disabling the copy per se). BTW, being pedantic, we don't copy a "class"; we copy an object (or instance) of a class.
  • IntervalTimer? constructor: passing (and exposing as a result) asio::io_service(&) breaks the intent of this wrapper and is not a good practice. The only place (as a short-to-middle term workaround) we can see it is in the cc::Session constructor. No other code except that within the asio_link wrapper should directly refer to it, even as a reference. Please update the interface so that it takes IOService(&) instead of asio::io_service(&). I believe you can do it with a trivial change.
  • IntervalTimer? destructor: it should explain it cancels a running timer. BTW: what happens if we destruct IntervalTimer? before starting a timer?
  • setupTimer(): this should explain the expected signature of cbfunc. See, for example, the asio documentation: http://think-async.com/Asio/asio-1.4.5/doc/asio/reference/basic_deadline_timer/async_wait.html It should also explain it doesn't only register the callback but also actually starts the timer.
  • setupTimer(): should it return a (boolean) value? It doesn't seem to return no other value than true. If so, it's better to make it void.

asio_link.cc

  • It'd be helpful if IntervalTimerImpl? could have more description. See other pimpl classes defined in asio_link.cc.
  • IntervalTimerImpl::setupTimer(): IMO this use of assert is wrong. In this case we should use exception, and document the restriction, and test the case.
  • IntervalTimerImpl::callback(): Likewise, this assert seems to be wrong because, if I understand the code correctly, the application can (incorrectly) pass an empty functor via setupTimer(). If want to make sure this doesn't happen, we should require it at the time of setupTimer(), throw an exception otherwise, document it, and test it.

asio_link_unittest.cc

  • doTimerTest() doesn't seem to have to be in the ASIOLinkTest class (because it's small and is only used by a single test). Placing it in a different place makes the code unreadable.
  • doTimerTest(): I'd like to make sure the timer expires with the given interval, although it could be optional.
  • timerCallBack() doesn't have to be a separate function (it can be embedded in TimerCallBack::operator())
  • ASIOLinkTest::startIntervalTimer(): why is itimer allocated dynamically. It makes (at least in theory) the code exception unsafe, and we can actually simply avoid the dynamic allocation:
        IntervalTimer itimer(io_service_->get_io_service());
        doTimerTest(&itimer);
    
    btw, you can omit the namespace "asio_link::" because we declare 'using namespace' for it.
  • I'd like to see more test cases, including
    • Invalid arguments (see the comments on IntervalTimerImpl?)
    • what happens if we delete the timer
    • call setupTimer multiple times, with different arguments

auth_srv_unittest.cc

  • as commented, we need a separate test class for Counter (shouldn't be "in future", especially because the tests are incomplete - see below).
  • AuthSrvTest::sendStatsWithoutSession() This is not a good unit test because a human needs to examine the output. Modify the interface and use EXPECT_xx.
  • AuthSrvTest::counter_incUDP/TCP(): these don't do any tests. In terms of auth server, we should test by sending a "fake" UDP/TCP message and see if the corresponding counter is incremented. For the more focused tests of Counter, we directly increment the counters and check the results. For that matter, we'll need to an inteface to retrieve the specified counter value in the Counter class (which would be good anyway). BTW, using _ in the test name effectively breaks our naming rule. I'd rename it to inc[rement]CounterUDP, etc. Same comment applies to some other cases.
  • AuthSrvTest::counter_sendStatsWithSession(): this test is insufficient. We should define an appropriate MockSession?, which intercepts the submitted message, parses and stores it, and then we should check the submitted values are correct using EXPECT_xxx.

main.cc

  • my_command_handler()
    • in this context auth_server shouldn't be NULL. I'd rather assert it or assume it's valid.
  • main()
    • I have comments/suggestions about timer callbacks. See above.

ChangeLog?

  • Shouldn't Kambe-san be listed as a contributor, too? (Just checking.)
  • It would be nice if it could briefly explain how we can see the collected statistics (via bindctl perhaps).

comment:8 in reply to: ↑ 5 ; follow-up: Changed 9 years ago by y-aharen

  • Owner changed from naokikambe to jinmei

Jinmei-san, thank you very much for your review. Before fixing the code and committing to branch, I have few points to be clear.

Replying to jinmei:

stats.cc

  • The comment and log messages assume that the other end is the b10-stats program, but interface-wise it's not necessarily correct (that could be a third-party app that implements the stat API, right?)

Counter in Auth module is not intended to communicate with third-party stats module. It only communicates with b10-stats.

  • BTW, even if it doesn't cause timeout and exception, we should actually not even block in group_recvmsg (and also group_sendmsg(), which can happen because it's a connected channel): b10-auth shouldn't stop handling DNS queries even for 1 second. Since it should relatively a rare event we can live with the current blocking mode (and there are actually other parts of b10-auth that have the same problem). But please leave a TODO comment on this point. We should eventually solve it.

I don't think that blocking in group_{send,recv}msg also blocks handling of DNS queries. I understand blocking read/write operation stops in the ASIO library, and if another event arrives ASIO starts handling it. Excuse me if I misunderstood.

comment:9 in reply to: ↑ 8 Changed 9 years ago by jinmei

  • Owner changed from jinmei to y-aharen

Replying to y-aharen:

stats.cc

  • The comment and log messages assume that the other end is the b10-stats program, but interface-wise it's not necessarily correct (that could be a third-party app that implements the stat API, right?)

Counter in Auth module is not intended to communicate with third-party stats module. It only communicates with b10-stats.

Hmm, this is different from my general understanding (general = not just about statistics): the protocol among various BIND 10 modules is open, and any single module can be replaced (and should be replaceable) with other implementation, as long as it supports the inter module protocol. For example, we can replace b10-xfrin with a third party xfrin process without affecting how the entire BIND 10 system behaves. And, there shouldn't be a reason why statistics is an exception.

It's possible my understanding is incorrect, however, so, if you don't agree please bring it to bind10-dev.

  • BTW, even if it doesn't cause timeout and exception, we should actually not even block in group_recvmsg (and also group_sendmsg(), which can happen because it's a connected channel): b10-auth shouldn't stop handling DNS queries even for 1 second. Since it should relatively a rare event we can live with the current blocking mode (and there are actually other parts of b10-auth that have the same problem). But please leave a TODO comment on this point. We should eventually solve it.

I don't think that blocking in group_{send,recv}msg also blocks handling of DNS queries. I understand blocking read/write operation stops in the ASIO library, and if another event arrives ASIO starts handling it. Excuse me if I misunderstood.

This is not true for group_sendmsg(), because it uses a blocking primitive of ASIO (asio::write()).

I actually thought it also applied to group_recvmsg(), but after re-checking the code I found (or rediscovered) that it uses a tricky asynchronous loop with a timer (internally calling asio::run_one()), so, you seem to be right; the internal loop will handle other events while waiting for the stats response.

But I suspect it was a non intended side effect, and the original intent was to introduce a timeout for a blocking read operation (if you want to be sure, we should ask Jelte, who implemented it). Actually, calling run_one() in an event handler of asio::run() don't seem to be safe in general; there seems to be possible race condition in handling a lock for the shared io_message object. Besides, in my understanding we'll eventually integrate this code into the generic asio_link wrapper, at which point this will probably be a blocking operation again.

So, basically, I think my point still stands. Note, however, that I didn't request to solve this problem. This is just a note, and my suggestion was to leave some comments for future fixes.

comment:10 in reply to: ↑ 5 ; follow-up: Changed 9 years ago by y-aharen

  • Owner changed from y-aharen to jinmei

Jinmei-san, thank you very much for your review!

Replying to jinmei:

general comments

  • Documentation: It's generally well written, but I'd like to see description about exceptions, that is, whether each method can throw an exception, and if yes, when.

OK. I described it.

  • If you don't do propset for $Id$, this line should be removed (in git, it will be meaningless "random" string, so we should probably deprecate this convention, at least for newly created files)

OK. I did propset.

stats.h

  • I made some trivial (I think) editorial changes and commit it to the branch (r3689). Please check them.

I saw it. It seems to be good. Thank you.

  • The class name "Counter" sounds like representing a single counter and counter intuitive.
  • I'd like to see more in the class description, including how it generally works and how we are expected to use it.
  • Is this query specific? Except the counter names there doesn't seem to be anything specific to queries (or in that sense even specific to an authoritative server). If the intent is to extend the usage to other types of statistic counters eventually, it should be described accordingly.

"Counter" class holds a set of query counters. Thus, I renamed it to
"QueryCounters?". Is it OK? Currently it holds only "UDP" and "TCP"
type, but we may extend other types like "Notify", "A/Ixfr" type and
so on in the future. I described it as a comment in the code.

  • setStatsSession(): I guess most of the points noted for AuthSrv::setXfrinSession() should apply.

OK. I described it.

  • getCallback(): I think this method naming is too restrictive, and I'd

suggest renaming it to, e.g., getSubmitFunc(). The reason is below.

Whether it's used as a callback (e.g. from a timer) is a matter of

caller, and the Counter class doesn't have to care about it. All the
responsibility of this method is to return a callable functor object that
when called submit the collected statistics through the session channel
(if provided via setStatsSession()). In fact, b10-auth could always get
the functor and submit the statistics by its own, right? (for example, it
may want to submit the final statistics before it shuts down). In
general, if some public part of a class can be independent of how it can
be used, it's generally better to make that point explicit so that the
class is more cohesive (and less coupled with its users).

For the same reason, I think it's better to change the return type to

some more generic functor type (and I believe it's easy).

I almost adopted your next suggestion 6.

And for the same reason, it's okay (and useful) to mention its

intentional usage as a hint, but it should also be described that it can
be used in other ways.

OK. I described it.

stats.cc

  • why do we need a separate namespace? Just following what asio_link did? If so, the intent was to make it a separate library belonging to a separate namespace eventually. Unless the stat module has the same intent, I'd suggest not having the separate namespace for now (we should probably need a common "auth" namespace, although that's a different topic)

OK. I removed "statistics" namespace and will add common namespace
"auth" in the future. It was described as TODO.

  • naming: it would be better to use consistent terminology, i.e. if we need and use a namespace named "statistics", the

header/implementation

files should better be called "statistics.{h,cc}", and vice versa.
I personally prefer (especially in C++) complete words such as
"statistics" rather than the geeky abbreviation like "stats", but
it's generally a matter of taste.
(BTW: the redundancy is also related to the question of why we need
a separate name space.)

OK. I renamed the file name to "statistics.{h,cc}".

  • CounterImpl? declarations: getCallback() and sendStats() could/should be const member functions.

OK. I did it.

  • CounterImpl? constructor: it seems ad hoc to pass "verbose_mode" this way. It's also not a good practice in that it ties the Counter(Impl) class with the AuthSrv? class through an implicit interface (i.e. not via public methods). I can understand if this is a short term workaround until we have a well defined logging framework, however. If so, please explicitly note this in the doxygen comment. If not, and you think this is a good technique, we need to discuss it because I'd strongly disagree:-)

I agree with you. I noted it as TODO.

  • CounterImpl? constructor: why dynamically allocating counters_? I don't see the need for it.

OK, it doesn't need to be. I modified it to allocate statically.

  • CounterImpl::sendStats()
  • CounterImpl::sendStats() try-catch part
    • group_sendmsg() (not recv) can also throw. should this be caught or ignored here? probably it should, because we wouldn't like to kill b10-auth just due to some communication failure (though it may be quite rare). In this sense, I suspect the comments should be positioned closer to the group_recvmsg() call.
    • group_recvmsg() can throw (at least) two types of exceptions: SessionTimeout? when it times out; SessionError? for other errors, including when receiving bogus data. I guess we'd like to catch both for the same reason as that for group_sendmsg().

OK. I modified the code to catch these exceptions as suggested below.

  • The comment and log messages assume that the other end is the b10-stats program, but interface-wise it's not necessarily correct (that could be a third-party app that implements the stat API, right?)

I agree with your reply in jinmei.

  • Considering these, my suggestion of this part is as follows:

Thank you. It seemed be good suggestion. I adopt it.

    // group_{send,recv}msg() can throw an exception when encountering
    // an error, and group_recvmsg() will throw an exception on timeout.
    // We don't want to kill the main server just due to this, so we
    // handle them here.
    try {
        const int seq =
            stats_session_->group_sendmsg(set_stats, "Stats");
        if (verbose_mode_) {
            std::cerr << "[b10-auth] send statistics data" << std::endl;
        }

        // TODO: parse and check response from the statistics recipient;
        // in case of b10-stats, currently it just returns empty message.
        isc::data::ConstElementPtr env, answer;
        stats_session_->group_recvmsg(env, answer, false, seq);
    } catch (const isc::data::SessionError& ex) {
        if (verbose_mode_) {
            std::cerr << "[b10-auth] "
                      << "timeout happened while sending statistics data:
"
                      << ex.what() << std::endl;
        }
    } catch (const isc::data::SessionTimeout& ex) {
        // catch the exception and do nothing.
        if (verbose_mode_) {
            std::cerr << "[b10-auth] "
                      << "communication error in sending statistics data:
"
                      << ex.what() << std::endl;
        }
    }
  • BTW, even if it doesn't cause timeout and exception, we should actually not even block in group_recvmsg (and also group_sendmsg(), which can happen because it's a connected channel): b10-auth shouldn't stop handling DNS queries even for 1 second. Since it should relatively a rare event we can live with the current blocking mode (and there are actually other parts of b10-auth that have the same problem). But please leave a TODO comment on this point. We should eventually solve it.

I agree with your reply in jinmei. I wrote a TODO comment which
describes the problem.

  • CounterImpl::sendStats(): the "catch" case doesn't seem to be covered by the tests

OK. I wrote new tests in statistics_unittest.cc.

  • Counter::getCallback(): this could/should be a const member function.

OK. getCallback() no longer exists.

Replying to jinmei:

Comment(by jinmei):

Replying to jinmei:

Self follow-up:

  • getCallback(): I think this method naming is too restrictive, and I'd

suggest renaming it to, e.g., getSubmitFunc(). The reason is below.

On further look, it doesn't even seem to have to be a functor. We could
define a callback exactly where it's used, i.e., (in the current
implementation) in main.cc, and other methods could just be
straightforward normal functions:

in main.cc

void
statsTimerCallback(AuthSrv* server) {
    // we may have to care the case where 'server' is invalid
    server->submitStatistics(); // instead of getStatsCallback()
}
...
        itimer->setupTimer(boost::bind(statsTimerCallback, auth_server),
                           STATS_SEND_INTERVAL_SEC);

AuthSrv::submitStatistics() would simply call counter->submitStatistics()
(renamed from getCallback()), and Counter::submitStatistics() would do
what CounterImpl::sendStats() currently does.

IMO this is a much better design.

I agree with you. I adopted it.

Replying to jinmei:

general

  • The Counter class should have its own tests.

OK. QueryCounters? (renamed from Counter) has its own tests in
statistics_unittest.cc.

auth_srv.h

  • I don't think this have to be a doxygen comment:
    /// for the typedef of asio_link::IntervalTimer::Callback
    
    • BTW, if we change the callback scheme as I suggested, we don't have include asio_link.h/stats.h. (Counter can be a forward declaration, and shouldn't be here anyway. see below)
  • Basically up to you, but this comment seems trivial and I wouldn't bother to mention it:
    #include <auth/stats.h>
    

OK. As suggested above I changed the callback scheme and these includes
are not necessary now.

  • getStatsCallback(): see the callback scheme discussion.

OK. This function no longer exists.

  • Member variable "counter" breaks the spirit of pimpl. The comment also does not seem to make sense to me. Whether it's incremented in a member function of AuthSrv? or AuthSrvImpl?, "counter" can be defined in AuthSrvImpl?. The current implementation also is exception unsafe due to this variable. See below. In any case "counter" should be hidden in impl_. BTW, to be consistent about naming, I'd rename it to "counter_".

OK. I renamed it to "counters_" and it is now inside AuthSrvImpl?.

auth_srv.cc

  • AuthSrv? constructor: this is not exception safe. If the allocation of counter fails memory for impl_ will leak. Due to this, and for other reasons also, counter should in impl_. BTW: it also doesn't make sense to me to dynamically allocate Counter, especially if/when we hide it in .cc. Unless this is optional (in the current implementation it's not) AuthSrvImpl? can simply contain a Counter object.

OK. I modified it to allocate statically (and it is now in AuthSrvImpl?).

  • AuthSrv::processMessage(): in which case we can see getProtocol() be neither UDP nor TCP? In general, I don't like to handle an impossible case this way. I'd either throw an exception or even assert() here.

I agree with you. I modified to throw an exception.

  • AuthSrv::getStatsCallback(): the comment seems to me so trivial that I wouldn't bother to mention it:-)

OK. AuthSrv::getStatsCallback() no longer exists.

auth.spec.pre.in

  • as pointed out before, it's not necessarily "b10-stats":
    +        "command_description": "Send statistics data to b10-stats at once",
    
    I'd say, e.g., "...to a statistics module at once".

OK. I almost adopt it.

asio_link.h

  • I'd like to see more description for the IntervalTimer? class, including how it generally works and how it should be used (sample code may help).

OK. I described sample code and how it works.

  • I don't understand this statement:
    +/// Copy of this class is prohibited not to call the callback function twice.
    
    Could the callback function be called twice if we allowed copy? (I don't object to disabling the copy per se). BTW, being pedantic, we don't copy a "class"; we copy an object (or instance) of a class.

I misunderstood, it can't be called twice. I removed the comment.
But I leave the class non-copyable.

  • IntervalTimer? constructor: passing (and exposing as a result) asio::io_service(&) breaks the intent of this wrapper and is not a good practice. The only place (as a short-to-middle term workaround) we can see it is in the cc::Session constructor. No other code except that within the asio_link wrapper should directly refer to it, even as a reference. Please update the interface so that it takes IOService(&) instead of asio::io_service(&). I believe you can do it with a trivial change.

I agree with you. The constructor now receives a reference to IOService.

  • IntervalTimer? destructor: it should explain it cancels a running timer. BTW: what happens if we destruct IntervalTimer? before starting a timer?

I thought it is required to explicitly cancel the timer before destruction.
I removed the code to cancel the timer because it is cancelled
in the destruction of deadline_timer (I wrote a comment).

OK. I described them.

  • setupTimer(): should it return a (boolean) value? It doesn't seem to return no other value than true. If so, it's better to make it void.

OK. I made it void.

asio_link.cc

  • It'd be helpful if IntervalTimerImpl? could have more description. See other pimpl classes defined in asio_link.cc.

I don't see much description in *Impl classes in asio_link.cc, but
I added some comments to IntervalTimerImpl?.

  • IntervalTimerImpl::setupTimer(): IMO this use of assert is wrong. In this case we should use exception, and document the restriction, and test the case.

OK. I modified to throw the exception, documented the restriction
(interval should be greater than 0, call back function should not
be empty), and wrote a test case.

  • IntervalTimerImpl::callback(): Likewise, this assert seems to be wrong because, if I understand the code correctly, the application can (incorrectly) pass an empty functor via setupTimer(). If want to make sure this doesn't happen, we should require it at the time of setupTimer(), throw an exception otherwise, document it, and test it.

OK. IntervalTimerImpl::callback() no longer exists. setupTimer() checks
its arguments.

asio_link_unittest.cc

  • doTimerTest() doesn't seem to have to be in the ASIOLinkTest class (because it's small and is only used by a single test). Placing it in a different place makes the code unreadable.

OK. I moved the codes into TEST_F(ASIOLinkTest, startIntervalTimer).
doTimerTest() no longer exists.

  • doTimerTest(): I'd like to make sure the timer expires with the given interval, although it could be optional.

OK. Some tests requires the timer expires with the given interval, so
I wrote a test case. The test allows interval has a mergin
(currently +/- 50 milliseconds).

  • timerCallBack() doesn't have to be a separate function (it can be embedded in TimerCallBack::operator())

OK. I embedded the codes to TimerCallBack::operator()().

  • ASIOLinkTest::startIntervalTimer(): why is itimer allocated dynamically. It makes (at least in theory) the code exception unsafe, and we can actually simply avoid the dynamic allocation:
        IntervalTimer itimer(io_service_->get_io_service());
        doTimerTest(&itimer);
    
    btw, you can omit the namespace "asio_link::" because we declare 'using namespace' for it.

OK. Almost all of IntervalTimers? are now allocated statically, except in
destructIntervalTimer. In the test code we have to delete it.
I enclosed it by ASSERT_NO_THROW, is that OK?

  • I'd like to see more test cases, including
    • Invalid arguments (see the comments on IntervalTimerImpl?)
    • what happens if we delete the timer
    • call setupTimer multiple times, with different arguments

OK. I wrote some test cases: invalidArgumentToIntervalTimer,
destructIntervalTimer and overwriteIntervalTimer.

auth_srv_unittest.cc

  • as commented, we need a separate test class for Counter (shouldn't be "in future", especially because the tests are incomplete - see below).

OK. I wrote a seperated test case for QueryCounters?.

  • AuthSrvTest::sendStatsWithoutSession() This is not a good unit test because a human needs to examine the output. Modify the interface and use EXPECT_xx.

OK. AuthSrvTest::sendStatsWithoutSession() no longer exists.
I modified submitStatistics() to return bool and test it is 'true'.

  • AuthSrvTest::counter_incUDP/TCP(): these don't do any tests. In terms of auth server, we should test by sending a "fake" UDP/TCP message and see if the corresponding counter is incremented. For the more focused tests of Counter, we directly increment the counters and check the results. For that matter, we'll need to an inteface to retrieve the specified counter value in the Counter class (which would be good anyway). BTW, using _ in the test name effectively breaks our naming rule. I'd rename it to inc[rement]CounterUDP, etc. Same comment applies to some other cases.

OK. I renamed them and wrote some test cases to adopt your suggestion.

  • AuthSrvTest::counter_sendStatsWithSession(): this test is insufficient. We should define an appropriate MockSession?, which intercepts the submitted message, parses and stores it, and then we should check the submitted values are correct using EXPECT_xxx.

OK. I wrote a test case that parses and examines the submitted message.

main.cc

  • my_command_handler()
    • in this context auth_server shouldn't be NULL. I'd rather assert it or assume it's valid.

OK. I added a code to assert it is not NULL.

  • main()
    • I have comments/suggestions about timer callbacks. See above.

OK. I modified as you have suggested.

ChangeLog?

  • Shouldn't Kambe-san be listed as a contributor, too? (Just checking.)

He says not.

  • It would be nice if it could briefly explain how we can see the collected statistics (via bindctl perhaps).

OK. I added a part of the result of the command 'sendstats'.

Thank you again for your great feedback.

comment:11 Changed 9 years ago by jinmei

test update. please ignore.

comment:12 in reply to: ↑ 10 ; follow-up: Changed 9 years ago by jinmei

  • Owner changed from jinmei to y-aharen

Replying to y-aharen:

Overall, the revised branch looks much better. I have some non
trivial comments though, which are below.

general comment

  • I made some trivial suggested changes (r3904). Please check them out (and override any of them if necessary)
  • While I personally prefer "statistics" over "stats", consistency is even more important. Right now there is a mixture of "xxxStatsyyy" and "xxxStatisticsyyy" (e.g. setStatsSession() vs submitStatistics()), which is not good. If we name something "xxxStatistics", everything else should be named with "Statistics" (not stats); if we name it "xxxStats", others should also be named with "Stats". The choice is ultimately yours, but it should be consistent. I've fixed one obvious issue with this regard directly in the branch: header guard definition (STAT(ISTIC)S_H).

statistics.h

  • now there's no need to include boost/function.hpp. I've removed it.
  • as for the class name:

"Counter" class holds a set of query counters. Thus, I renamed it to
"QueryCounters?". Is it OK? Currently it holds only "UDP" and "TCP"
type, but we may extend other types like "Notify", "A/Ixfr" type and
so on in the future. I described it as a comment in the code.

My concern is that with the intent of extension we'll soon encounter
the need for renaming the class. Note that notify/ixfr/update are not
even "queries" (to be very accurate about terminology); we may want to
have other statistics counters that may not be related to DNS protocol
handling (e.g, we may want to have a counter about the number of
reloads). I don't know the future plans about these possibilities,
but unless we want to define separate classes for these different
types of statistics parameters, I'd choose a more generic name such as
AuthCounters? (but note also that as I pointed out in my previous
comment it may not even be specific to the authoritative server; we'll
naturally consider having the same framework for b10-recurse, for
example. So, eventually, we may want to have a separate library
module src/lib/statistics or something like that, where we define a
most generic "Counters" class that can be used for any kinds of
counter based statistics).

But at the same time, there's a risk of premature overabstraction, so
it's a tradeoff issue. The choice is basically up to you, but if I
were the principal author, I'd begin with "AuthCounters?" with a note
that we may eventually want to make it a separate library with a
higher abstraction level (with the cost of renaming classes and its
overhead).

statistics.cc

  • getCounters() (effectively) exposes an internal implementation details (i.e., that the counters are stored in std::vector), which is not a good practice. I'd prefer passing a counter type and retrieving that specific counter. BTW: why "shouldn't this be called except from tests" as documented in the header file? It may not be of much use in general, but I don't see the reason for prohibiting it. And, in any case, restricting usage by comment is not a reliable way to do so.
  • QueryCountersImpl::inc(): is there any reason for converting std::out_of_range to InvalidParameter?? On one hand that's generally a good practice, but we basically allow standard exceptions to be propagated through our public interfaces (unlike boost exceptions, which we generally try to hide), so we don't have to do anything special here. If there's a good reason to catch (and convert) the exception internally, we should catch it by reference, not as an object.

auth_srv.cc

  • AuthSrv::processMessage
  • AuthSrv::processMessage(): in which case we can see getProtocol() be neither UDP nor TCP? In general, I don't like to handle an impossible case this way. I'd either throw an exception or even assert() here.

I agree with you. I modified to throw an exception.

Okay, but the exception case isn't tested. Please also add
documentation about the exception in auth_srv.h (to make it better
gradually).

  • AuthSrv::processMessage another point: this method handles all incoming requests including notify and IXFR, not just normal queries. So, technically, incrementing a "query" counter here is not correct. See also the discussion about class naming above.
  • AuthSrv::submitStatistics: this should be a const member function.

asio_link.h

  • about the IntervalTimer? class description: I'm not sure whether we need this note:
    /// This class is mainly designed to use for calling
    /// \c QueryCounters::submitStatistics() periodically.
    
    What's the purpose of this note? The interface of interval timer seems quite generic and I don't see the need for mentioning the "main purpose" here.
  • IntervalTimer? constructor
    • We should document that "cbfunc" shouldn't be an empty functor.
    • Can it throw asio::system_error? I was not sure how that could happen, but if it really happens:
      • we should catch the exception internally and convert it to our own one (we need to hide any asio specific definitions, which is the purpose of this wrapper in the first place)
      • we should cover that case in tests.
  • IntervalTimer? destructor: my previous question doesn't seem to be addressed in the documentation: "what happens if we destruct IntervalTimer before starting a timer?"
  • setupTimer(): like the constructor, asio::system_error shouldn't be exposed.

asio_link.cc

  • IntervalTimerImpl::callback()

OK. IntervalTimerImpl::callback() no longer exists. setupTimer() checks
its arguments.

I guess you mean "no longer exits (or asserts)"? BTW with the check
in setupTimer() I'm okay with the assert() (or may even be a good
practice, but it's up to you).

asio_link_unittest.cc

  • as a general comment, it looks like the added tests make ASIOLinkTest unnecessarily big(ger). ASIOLinkTest is already pretty complicated due to perform some tricky tests using actual socket I/O, while the timer tests don't need anything of these. It would be much better to define a different test fixture (say, IntervalTimerTest?) and enclose all timer related stuff there.
  • there's no need to construct an IntervalTimer::Callback object explicitly thanks to the magic of boost::function. e.g., instead of:
                    timer_.setupTimer(IntervalTimer::Callback(
                                          TimerCallBack(test_obj_)), 1);
    
    we could say:
                    timer_.setupTimer(TimerCallBack(test_obj_), 1);
    
    This will help make the code simpler and more readable.
  • ASIOLinkTest::startIntervalTimer()

OK. Almost all of IntervalTimers? are now allocated statically, except in
destructIntervalTimer. In the test code we have to delete it.
I enclosed it by ASSERT_NO_THROW, is that OK?

It doesn't address the concern about the exception safety. It's not
about the case where newing the timer throws an exception. It's about
an exception triggered after the new before the test terminates. One
way to make it 100% correct would be to use a smart pointer for the
itimer, but that would be probably too much in this context; if the
test throws unexpectedly, the whole program would soon exit anyway, so
the perfect exception safety wouldn't be worth the code complexity.
After making the itimers real object whenever possible, my suggestion
is to ignore the exception safety problem with an explicit comment
like: "this code isn't exception safe, but we'd rather keep the code
simpler and more readable as this is only for tests and if it throws
the program would immediately terminate anyway".

BTW, whether or not adopting this suggestion, these _NO_THROW()s don't
make sense anyway. I'd remove them.

  • the additional tests introduce another 10 seconds of delay. in general, imposing a long delay in unit tests is not a good practice as unit tests are supposed to be performed "instantly". These types of delay have been introduced in our tests already (some of them were introduced by myself), and so it's not specific to the timer tests, but 10 seconds still look quite substantial. I'd not necessarily request it be solved right now, but the timer should eventually have finer granularity for the timer duration and use shorter timer periods for tests. For the moment I suggest adding some notes about this point in comments.

statistics_unittest.cc

  • in QueryCountersTest?() verbose_mode_ should be initialized to false explicitly. Otherwise, it could cause noisy log messages during tests depending on the environment (it happened to me).
  • incrementUDPCounter: I'd test if the counter is initially 0, and if it's really incremented by one after counters.inc(). Same for incrementTCPCounter, AuthSrvTest::queryCounter{UDP,TCP}.

ChangeLog?

  • I couldn't play with the statistics using bindctl just from this description. After understanding how it works, I'd suggest revising the text as follows:
      TBD.  [func]		y-aharen
    	src/bin/auth: Added a feature to count queries and send counter
    	values to b10-stats periodically. To support it, added wrapping
    	class of asio::deadline_timer to use as interval timer.
    	The counters can be seen using the "Stats show" command from
    	bindctl.  The result would look like:
    	  ... "auth.queries.tcp": 1, "auth.queries.udp": 1 ...
    	Using the "Auth sendstats" command you can make b10-auth send
    	the counters to b10-stats immediately.
    	(Trac #347, svn rxxxx)
    
    (I used b10-stats in this context instead of generic "statistics module" because it would be more helpful for the users)

comment:13 in reply to: ↑ 12 ; follow-up: Changed 9 years ago by y-aharen

  • Owner changed from y-aharen to jinmei

Jinmei-san, thank you very much for you review.

Replying to jinmei:

Replying to y-aharen:

Overall, the revised branch looks much better. I have some non
trivial comments though, which are below.

general comment

  • I made some trivial suggested changes (r3904). Please check them out (and override any of them if necessary)

I've checked them and they looks good. Thank you.

  • While I personally prefer "statistics" over "stats", consistency is even more important. Right now there is a mixture of "xxxStatsyyy" and "xxxStatisticsyyy" (e.g. setStatsSession() vs submitStatistics()), which is not good. If we name something "xxxStatistics", everything else should be named with "Statistics" (not stats); if we name it "xxxStats", others should also be named with "Stats". The choice is ultimately yours, but it should be consistent. I've fixed one obvious issue with this regard directly in the branch: header guard definition (STAT(ISTIC)S_H).

Sorry, I forgot to change some name. I changed the names as you suggested.

statistics.h

  • now there's no need to include boost/function.hpp. I've removed it.

OK, it can be removed.

  • as for the class name:

"Counter" class holds a set of query counters. Thus, I renamed it to
"QueryCounters?". Is it OK? Currently it holds only "UDP" and "TCP"
type, but we may extend other types like "Notify", "A/Ixfr" type and
so on in the future. I described it as a comment in the code.

My concern is that with the intent of extension we'll soon encounter
the need for renaming the class. Note that notify/ixfr/update are not
even "queries" (to be very accurate about terminology); we may want to
have other statistics counters that may not be related to DNS protocol
handling (e.g, we may want to have a counter about the number of
reloads). I don't know the future plans about these possibilities,
but unless we want to define separate classes for these different
types of statistics parameters, I'd choose a more generic name such as
AuthCounters? (but note also that as I pointed out in my previous
comment it may not even be specific to the authoritative server; we'll
naturally consider having the same framework for b10-recurse, for
example. So, eventually, we may want to have a separate library
module src/lib/statistics or something like that, where we define a
most generic "Counters" class that can be used for any kinds of
counter based statistics).

But at the same time, there's a risk of premature overabstraction, so
it's a tradeoff issue. The choice is basically up to you, but if I
were the principal author, I'd begin with "AuthCounters?" with a note
that we may eventually want to make it a separate library with a
higher abstraction level (with the cost of renaming classes and its
overhead).

I think the name 'AuthCounters?' is suitable for the class, so I renamed it. I also noted the design of the class may change in the future, not only holding counter values nor specific to Auth.

statistics.cc

  • getCounters() (effectively) exposes an internal implementation details (i.e., that the counters are stored in std::vector), which is not a good practice. I'd prefer passing a counter type and retrieving that specific counter. BTW: why "shouldn't this be called except from tests" as documented in the header file? It may not be of much use in general, but I don't see the reason for prohibiting it. And, in any case, restricting usage by comment is not a reliable way to do so.

To pass 'QueryCounterType?' we need definition of it. AuthSrv? provides an interface to getCounters() so I've added '#include <auth/statistics.h>' into auth_srv.h and modified getCounter() to return single counter value of specified type.

  • QueryCountersImpl::inc(): is there any reason for converting std::out_of_range to InvalidParameter?? On one hand that's generally a good practice, but we basically allow standard exceptions to be propagated through our public interfaces (unlike boost exceptions, which we generally try to hide), so we don't have to do anything special here. If there's a good reason to catch (and convert) the exception internally, we should catch it by reference, not as an object.

I removed try-catch which converts std::out_of_range to InvalidParameter?.QueryCountersImpl::inc() throws std::out_of_range.

auth_srv.cc

  • AuthSrv::processMessage
  • AuthSrv::processMessage(): in which case we can see getProtocol() be neither UDP nor TCP? In general, I don't like to handle an impossible case this way. I'd either throw an exception or even assert() here.

I agree with you. I modified to throw an exception.

Okay, but the exception case isn't tested. Please also add
documentation about the exception in auth_srv.h (to make it better
gradually).

I wrote test case and added documentation about the exception.

  • AuthSrv::processMessage another point: this method handles all incoming requests including notify and IXFR, not just normal queries. So, technically, incrementing a "query" counter here is not correct. See also the discussion about class naming above.

OK. I moved the code to a function and it is called from
AuthSrvImpl::processNormalQuery() and AuthSrvImpl::processAxfrQuery().

  • AuthSrv::submitStatistics: this should be a const member function.

OK. I made it const member function.

asio_link.h

  • about the IntervalTimer? class description: I'm not sure whether we need this note:
    /// This class is mainly designed to use for calling
    /// \c QueryCounters::submitStatistics() periodically.
    
    What's the purpose of this note? The interface of interval timer seems quite generic and I don't see the need for mentioning the "main purpose" here.

Just to describe why I wrote this class. I reconsider it and I think the comment should not be there (it is also described in ChangeLog?). I removed it.

  • IntervalTimer? constructor
    • We should document that "cbfunc" shouldn't be an empty functor.

I added documentation.

  • Can it throw asio::system_error? I was not sure how that could happen, but if it really happens:
    • we should catch the exception internally and convert it to our own one (we need to hide any asio specific definitions, which is the purpose of this wrapper in the first place)
    • we should cover that case in tests.

I misunderstood the asio::deadline_timer library. The constructor won't throw asio::system_error. I removed the documentation.

  • IntervalTimer? destructor: my previous question doesn't seem to be addressed in the documentation: "what happens if we destruct IntervalTimer before starting a timer?"

If we destruct it before starting a timer, the callback function should not be called. I wrote a test.

  • setupTimer(): like the constructor, asio::system_error shouldn't be exposed.

I wrote try-catch to convert asio::system_error to isc::Unexpected and added documentation. It is hard to modify the library to throw an exception, so I didn't wrote a test.

asio_link.cc

  • IntervalTimerImpl::callback()

OK. IntervalTimerImpl::callback() no longer exists. setupTimer() checks
its arguments.

I guess you mean "no longer exits (or asserts)"? BTW with the check
in setupTimer() I'm okay with the assert() (or may even be a good
practice, but it's up to you).

It means I removed assertion from IntervalTimerImpl::callback(). I think throwing an exception is suitable for parameter check, so the code throws an exception.

asio_link_unittest.cc

  • as a general comment, it looks like the added tests make ASIOLinkTest unnecessarily big(ger). ASIOLinkTest is already pretty complicated due to perform some tricky tests using actual socket I/O, while the timer tests don't need anything of these. It would be much better to define a different test fixture (say, IntervalTimerTest?) and enclose all timer related stuff there.

I added a test fixture 'IntervalTimerTest?' (partially copied from ASIOLinkTest). All of all timer related stuff are now in it.

  • there's no need to construct an IntervalTimer::Callback object explicitly thanks to the magic of boost::function. e.g., instead of:
                    timer_.setupTimer(IntervalTimer::Callback(
                                          TimerCallBack(test_obj_)), 1);
    
    we could say:
                    timer_.setupTimer(TimerCallBack(test_obj_), 1);
    
    This will help make the code simpler and more readable.

I modified the code as you suggested. It is much easier to read.

  • ASIOLinkTest::startIntervalTimer()

OK. Almost all of IntervalTimers? are now allocated statically, except in
destructIntervalTimer. In the test code we have to delete it.
I enclosed it by ASSERT_NO_THROW, is that OK?

It doesn't address the concern about the exception safety. It's not
about the case where newing the timer throws an exception. It's about
an exception triggered after the new before the test terminates. One
way to make it 100% correct would be to use a smart pointer for the
itimer, but that would be probably too much in this context; if the
test throws unexpectedly, the whole program would soon exit anyway, so
the perfect exception safety wouldn't be worth the code complexity.
After making the itimers real object whenever possible, my suggestion
is to ignore the exception safety problem with an explicit comment
like: "this code isn't exception safe, but we'd rather keep the code
simpler and more readable as this is only for tests and if it throws
the program would immediately terminate anyway".

BTW, whether or not adopting this suggestion, these _NO_THROW()s don't
make sense anyway. I'd remove them.

I removed ASSERT_NO_THROW and added comment as you suggested.

  • the additional tests introduce another 10 seconds of delay. in general, imposing a long delay in unit tests is not a good practice as unit tests are supposed to be performed "instantly". These types of delay have been introduced in our tests already (some of them were introduced by myself), and so it's not specific to the timer tests, but 10 seconds still look quite substantial. I'd not necessarily request it be solved right now, but the timer should eventually have finer granularity for the timer duration and use shorter timer periods for tests. For the moment I suggest adding some notes about this point in comments.

I noted these tests takes long time and this issue should be addressed in future.

statistics_unittest.cc

  • in QueryCountersTest?() verbose_mode_ should be initialized to false explicitly. Otherwise, it could cause noisy log messages during tests depending on the environment (it happened to me).

OK. verbose_mode_ is initialized to false.

  • incrementUDPCounter: I'd test if the counter is initially 0, and if it's really incremented by one after counters.inc(). Same for incrementTCPCounter, AuthSrvTest::queryCounter{UDP,TCP}.

I added some EXPECTs to these tests to check the counter is initially 0, and the value is 1 after counters.inc().

ChangeLog?

  • I couldn't play with the statistics using bindctl just from this description. After understanding how it works, I'd suggest revising the text as follows:
      TBD.  [func]		y-aharen
    	src/bin/auth: Added a feature to count queries and send counter
    	values to b10-stats periodically. To support it, added wrapping
    	class of asio::deadline_timer to use as interval timer.
    	The counters can be seen using the "Stats show" command from
    	bindctl.  The result would look like:
    	  ... "auth.queries.tcp": 1, "auth.queries.udp": 1 ...
    	Using the "Auth sendstats" command you can make b10-auth send
    	the counters to b10-stats immediately.
    	(Trac #347, svn rxxxx)
    
    (I used b10-stats in this context instead of generic "statistics module" because it would be more helpful for the users)

OK, I changed the text as you suggested.

Thank you again for your review.

comment:14 in reply to: ↑ 13 ; follow-up: Changed 9 years ago by jinmei

  • Owner changed from jinmei to y-aharen

Replying to y-aharen:

The latest code looks almost ready for merge. I've replied to some selected points that may not be trivial. I also added some new (not so substantial) comments. Other points are okay and simply ommitted.

  • IntervalTimer? destructor: my previous question doesn't seem to be addressed in the documentation: "what happens if we destruct IntervalTimer before starting a timer?"

If we destruct it before starting a timer, the callback function should not be called. I wrote a test.

I intended to ask to add a note in the documentation (which is not done yet, right?), but of course adding a test is good, too. Regarding the test, however, I'm not sure if it tests the right thing. I guess the new test for this is deleteIntervalTimerBeforeStart, but since it doesn't start the io_service_, it should be obvious that timer_called is unchanged. And, I guess the destructIntervalTimer actually tests what we'd want to test in this context.

  • setupTimer(): like the constructor, asio::system_error shouldn't be exposed.

I wrote try-catch to convert asio::system_error to isc::Unexpected and added documentation. It is hard to modify the library to throw an exception, so I didn't wrote a test.

Okay.

asio_link.cc

  • IntervalTimerImpl::callback()

OK. IntervalTimerImpl::callback() no longer exists. setupTimer() checks
its arguments.

I guess you mean "no longer exits (or asserts)"? BTW with the check
in setupTimer() I'm okay with the assert() (or may even be a good
practice, but it's up to you).

It means I removed assertion from IntervalTimerImpl::callback(). I think throwing an exception is suitable for parameter check, so the code throws an exception.

Throwing an exception in setupTimer() is fine. I meant the assert() in callback(), and now the setupTimer() ensures the parameters are valid and it's the only place they can be set, we can safely assert() the fact here. But this is just a comment; whether to re-introduce the assert() is up to you.

asio_link_unittest.cc

  • the additional tests introduce another 10 seconds of delay. in general, imposing a long delay in unit tests is not a good practice as unit tests are supposed to be performed "instantly". These types of delay have been introduced in our tests already (some of them were introduced by myself), and so it's not specific to the timer tests, but 10 seconds still look quite substantial. I'd not necessarily request it be solved right now, but the timer should eventually have finer granularity for the timer duration and use shorter timer periods for tests. For the moment I suggest adding some notes about this point in comments.

I noted these tests takes long time and this issue should be addressed in future.

Okay. Could you open a separate ticket about this so that we can keep truck of it?

additional comments

  • as usual, I've made some minor editorial suggestions (r3977)
  • as for QueryType?, I guess it's now better be named (e.g.) CounterType? now that the class name is not specific to queries.
  • likewise, s/COUNTER_UDP/COUNTER_UDP_QUERY/? (imagine we'll want to define a counter type for e.g. "UDP notify")
  • in asio_link.cc, this should (better) be "const asio::system_error&":
    +    } catch (asio::system_error& e) {
    
  • in the following part of auth_srv_unittest.cc, s/IPPROTO_IP/Unexpected/?
    +// Submit unexpected type of query and check it throws IPPROTO_IP
    

comment:15 in reply to: ↑ 14 ; follow-up: Changed 9 years ago by y-aharen

  • Owner changed from y-aharen to jinmei

Jinmei-san, thank you very much for you review.
Committed to branch r4015.

Replying to jinmei:

  • IntervalTimer? destructor: my previous question doesn't seem to be addressed in the documentation: "what happens if we destruct IntervalTimer before starting a timer?"

If we destruct it before starting a timer, the callback function should not be called. I wrote a test.

I intended to ask to add a note in the documentation (which is not done yet, right?), but of course adding a test is good, too. Regarding the test, however, I'm not sure if it tests the right thing. I guess the new test for this is deleteIntervalTimerBeforeStart, but since it doesn't start the io_service_, it should be obvious that timer_called is unchanged. And, I guess the destructIntervalTimer actually tests what we'd want to test in this context.

OK. I added documentation. The test seems to be unnecessary because the result is obvious, I removed it.

asio_link.cc

  • IntervalTimerImpl::callback()

OK. IntervalTimerImpl::callback() no longer exists. setupTimer() checks
its arguments.

I guess you mean "no longer exits (or asserts)"? BTW with the check
in setupTimer() I'm okay with the assert() (or may even be a good
practice, but it's up to you).

It means I removed assertion from IntervalTimerImpl::callback(). I think throwing an exception is suitable for parameter check, so the code throws an exception.

Throwing an exception in setupTimer() is fine. I meant the assert() in callback(), and now the setupTimer() ensures the parameters are valid and it's the only place they can be set, we can safely assert() the fact here. But this is just a comment; whether to re-introduce the assert() is up to you.

I see. I think it is not necessary to assert because the parameters are checked in setupTimer().

asio_link_unittest.cc

  • the additional tests introduce another 10 seconds of delay. in general, imposing a long delay in unit tests is not a good practice as unit tests are supposed to be performed "instantly". These types of delay have been introduced in our tests already (some of them were introduced by myself), and so it's not specific to the timer tests, but 10 seconds still look quite substantial. I'd not necessarily request it be solved right now, but the timer should eventually have finer granularity for the timer duration and use shorter timer periods for tests. For the moment I suggest adding some notes about this point in comments.

I noted these tests takes long time and this issue should be addressed in future.

Okay. Could you open a separate ticket about this so that we can keep truck of it?

OK. I will open a ticket later.

additional comments

  • as usual, I've made some minor editorial suggestions (r3977)

I've checked it and merged them. thank you.

  • as for QueryType?, I guess it's now better be named (e.g.) CounterType? now that the class name is not specific to queries.
  • likewise, s/COUNTER_UDP/COUNTER_UDP_QUERY/? (imagine we'll want to define a counter type for e.g. "UDP notify")

OK. I renamed them.

  • in asio_link.cc, this should (better) be "const asio::system_error&":
    +    } catch (asio::system_error& e) {
    

OK. I modified as you suggested.

  • in the following part of auth_srv_unittest.cc, s/IPPROTO_IP/Unexpected/?
    +// Submit unexpected type of query and check it throws IPPROTO_IP
    

Yes, it is isc::Unexpected, not IPPROTO_IP.

If there is no problem, I will merge the branch into trunk.

comment:16 follow-up: Changed 9 years ago by y-aharen

There's huge difference between trunk and branches/trac347, I merged the branch with trunk and committed r4018. Could you please check it?

comment:17 in reply to: ↑ 15 ; follow-up: Changed 9 years ago by jinmei

  • Owner changed from jinmei to y-aharen

Replying to y-aharen:

asio_link.cc

  • IntervalTimerImpl::callback()

Throwing an exception in setupTimer() is fine. I meant the assert() in callback(), and now the setupTimer() ensures the parameters are valid and it's the only place they can be set, we can safely assert() the fact here. But this is just a comment; whether to re-introduce the assert() is up to you.

I see. I think it is not necessary to assert because the parameters are checked in setupTimer().

It's up to you, but the reason doesn't make sense to me. With that logic we'd never need any assert(). assert() is to perform a run time check for something "that should never happen because we should have checked the parameters/validated the input/not modified the data, etc" in the first place. But in reality humans are not perfect and make mistakes, and anything that should never happen will often happen. That's why we need assert().

To be clear, again, it's up to you. After all, no one will do assert() in the obvious code like this:

    int i = 0;
    ++i;
    assert(i == 1);

The real cases are more subtle, of course, and different developers have different senses of "what's obvious". Asynchronous code logic is often tricky and tends to cause surprising bugs, which is why I thought assert()ing it may make sense. But if you think the correctness of the current code is sufficiently obvious, feel free to leave the code without the assert().

If there is no problem, I will merge the branch into trunk.

I'm okay with the branch. Please feel free to merge it.

comment:18 in reply to: ↑ 16 Changed 9 years ago by jinmei

Replying to y-aharen:

There's huge difference between trunk and branches/trac347, I merged the branch with trunk and committed r4018. Could you please check it?

I'm afraid it's beyond the duty of code reviewer, especially when that person is absurdly busy:-)

From a very quick browse of the diff I didn't find anything obviuosly wrong.

In general, if the resulting code compiles and all tests pass, it should be okay unless you resolve possible conflicts in an evil way (e.g., removing a test that fails). Our detailed unit tests should do a good job to detect regression.

comment:19 in reply to: ↑ 17 Changed 9 years ago by y-aharen

  • Resolution set to complete
  • Status changed from reviewing to closed

Replying to jinmei:

Replying to y-aharen:

asio_link.cc

  • IntervalTimerImpl::callback()

Throwing an exception in setupTimer() is fine. I meant the assert() in callback(), and now the setupTimer() ensures the parameters are valid and it's the only place they can be set, we can safely assert() the fact here. But this is just a comment; whether to re-introduce the assert() is up to you.

I see. I think it is not necessary to assert because the parameters are checked in setupTimer().

It's up to you, but the reason doesn't make sense to me. With that logic we'd never need any assert(). assert() is to perform a run time check for something "that should never happen because we should have checked the parameters/validated the input/not modified the data, etc" in the first place. But in reality humans are not perfect and make mistakes, and anything that should never happen will often happen. That's why we need assert().

To be clear, again, it's up to you. After all, no one will do assert() in the obvious code like this:

    int i = 0;
    ++i;
    assert(i == 1);

The real cases are more subtle, of course, and different developers have different senses of "what's obvious". Asynchronous code logic is often tricky and tends to cause surprising bugs, which is why I thought assert()ing it may make sense. But if you think the correctness of the current code is sufficiently obvious, feel free to leave the code without the assert().

OK, I leave the code without the assert().

If there is no problem, I will merge the branch into trunk.

I'm okay with the branch. Please feel free to merge it.

Merged to trunk at r4026. Thanks for your review.

I created 452 to address long delay in IntervalTimerTest?.

Note: See TracTickets for help on using tickets.