Opened 9 years ago

Closed 8 years ago

Last modified 8 years ago

#374 closed enhancement (invalid)

Writable datasources: Basic support for writables in C++ datasrc library

Reported by: jelte Owned by: jelte
Priority: medium Milestone:
Component: data source Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: 0.0 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

And implement it in the sqlite data source.

(This is a subset of ticket 232, which is too general)

Support for writable data sources in the C++ API, containing at least the following items:

  • Transactions
  • Adding and removing RRsets
  • Adding and removing zones
  • Replacing the contents of a zone completely (for loadzone and axfr)
  • IXFR
  • Dynamic update

Subtickets

Change History (17)

comment:1 follow-up: Changed 9 years ago by jelte

  • Owner changed from jelte to UnAssigned
  • Status changed from new to reviewing

Changes can be found by diffing r3257 to r3309 in branches/trac374
Alternatively, you can do

svn merge --reintegrate svn+ssh://bind10.isc.org/svn/bind10/branches/trac374

It should still merge cleanly.

Notes for review:

Since this is only the c++ part, and our XFR and loadzone functionality (i.e. the external interfaces to actually use this), actual data source changes are only done in the unit tests (i.e. you can't actually play with your own zones in this branch). See the upcoming branches for #375 and #376 for that.

I'd start out reading the (added) documentation in data_source.h, specifically the DataSrcTransaction? class (which is a general data placeholder for whatever any specific datasource implementation needs to perform transactions, and makes sure the transaction is finished later).

The design of the interface is generally based on the existing datasource one, so there's low-level functions that must be overridden (addRRset etc.), and high-level ones that work on DNS Message objects, that could be overridden but don't have to be. We might find that these need to be moved up out of the Data Source API, but for now I decided to keep them there, to keep it close to the query logic.

One 'open' issue is that most of the current code works extensively with return codes, and not so much with exceptions. Especially in functions like doIXFR this means a lot of if (success) statements. This may need to be reconsidered.

comment:2 in reply to: ↑ 1 Changed 9 years ago by jinmei

One quick feedback: it didn't compile for me:

/opt/local/include/gtest/gtest.h: In function ‘testing::AssertionResult testing::internal::CmpHelperEQ(const char*, const char*, const T1&, const T2&) [with T1 = isc::datasrc::AbstractDataSrc::Result, T2 = isc::datasrc::AbstractDataSrc::WriteResult]’:
/opt/local/include/gtest/gtest.h:1300:   instantiated from ‘static testing::AssertionResult testing::internal::EqHelper<lhs_is_null_literal>::Compare(const char*, const char*, const T1&, const T2&) [with T1 = isc::datasrc::AbstractDataSrc::Result, T2 = isc::datasrc::AbstractDataSrc::WriteResult, bool lhs_is_null_literal = false]’
sqlite3_unittest.cc:1647:   instantiated from here
/opt/local/include/gtest/gtest.h:1263: warning: comparison between ‘const enum isc::datasrc::AbstractDataSrc::Result’ and ‘const enum isc::datasrc::AbstractDataSrc::WriteResult’

There were many similar warnings (treated as an error due to -Werror).

If I made a trivial change of replacing DataSrc::SUCCESS with DataSrc::W_SUCCESS to the offending code:

    EXPECT_EQ(DataSrc::SUCCESS,
              data_source.doIXFR(transaction2,
                                 isc::datasrc::callbackHelperRRsetIterator,
                                 &begin,
                                 &end)
    );

it passed, but I don't know if that was the original intention.

comment:3 Changed 9 years ago by jelte

Oh, yes, I changed the result value of doIXFR to be DataSrc::WriteResult?, not Result. Apparently I forgot to update that test (and apparently my compiler doesn't seem to mind :/).

W_SUCCESS is what should be there.

comment:4 Changed 9 years ago by jinmei

Review is still ongoing, but in case Jelte has spare time or needs to
kill night time due to jetlag, I'm providing partial feedback.

general design

I personally still believe it's better to move DNS protocol specific
behavior out of the data source module as you perhaps felt. I've
commented on some specific points in this context below, too. But
assuming this is an initial attempt that will be revisited, I'm okay
with completing this work without touching this level of design issue.

Likewise, I still believe the current class hierarchy is not well
defined. The fact that many derived methods end up just returning
'not implemented' means (IMO) the relationship between the base and
derived classes is not a reasonable "is-a" relationship. But, again,
I'm okay with completing this work without touching this level of
topic.

general comments about the code

  • in general it would be better to pass reference instead of a (shared) pointer if it cannot be NULL in that context. this is good in that we wouldn't worry about the (buggy) case where a NULL is passed by accident, and in case it's a boost shared pointer used for a public method, it's good in that we can reduce dependency on boost things. I've commented on this topic in some specific points below, too.
  • there are some too long lines. if we follow BIND 9's guideline, it would be better to fold lines so that they don't exceed 79 characters.
  • it seems many variables could be defined as const. I'm pointing out specific cases as I add comments below (and probably in subsequent review comments), but it's quite likely to be incomplete, so it would be nice if you can perform double check.

general comments about protocol processing for dynamic update and IXFR

In general, it seems the current implementation needs more checks to
make it protocol compliant (see specific comments for some specific
issues). It's also not clear how much of the protocol processing
should be done in the C++ code, assuming we'll write python daemons
for these protocols.

If we can cleanly separate dynamic update and IXFR part of the code,
maybe we should merge the general part (possibly with the part
necessary for AXFR and zone loading) first, and then revisit the
protocol processing part with more unit tests (or even reconsider
whether to provide C++ code for them). Or, perhaps we could integrate
these parts as a "work in progress" item for now (after review, of
course) and not actually use them in our server programs, and
revisit/redesign them later.

data_source.cc

  • equalRRsets()
    • using keyword 'static' to indicate a function to be file local is obsolete in C++. the more 'politically correct' way is to define an unnamed namespace:
      namespace {
      bool
      equalRRsets(ConstRRsetPtr a, ConstRRsetPtr b) {
      ...
      }
      }
      
    • since 'a' and 'b' are assumed to be non null (shared) pointers, I'd declare them as 'const RRset&' (and adjust the caller side accordingly). Then we wouldn't have to worry about a buggy case where it's a null.
    • I'm afraid the current implementation of 'equality' check is incomplete. Consider the following corner cases (some of which may not be so rare)
      • RRsets with different TTLs
      • there's a duplicate RDATA (e.g. there are two A RRsets, one contains a single '192.0.2.1', and the other contains two '192.0.2.1's.
      • RDATA is equal as a set, but the ordering is different.
      FYI, BIND 9 takes care of the ordering issue. ignoring TTLs is probably okay in this context (and I know it's even 'necessary') especially because it's a private helper function. but I'd add an explicit note that TTL is ignored in the definition of 'equal'.
  • haveRRset()
    • is it okay to assume the IN class for ANY/NONE?
    • as a public method, the logic is too dynamic update-specific and counter intuitive. this could actually be non public, non method function?
    • since (it seems) 'rrset' couldn't be a null (shared) pointer, I'd declare it as a 'const RRset&'. See the comment for equalRRsets(). and , this would be especially good for public interfaces because it makes it less dependent on boost definitions.
  • updateCheckPrerequisite()
    • I'd declare prereq as 'const RRset&' (see above)
    • what about case RRSIG? (shouldn't we cover type-covered? FYI, BIND9 does something special in case of updating RRSIG)
    • does this code cover empty non-terminal cases?
  • updateProcessUpdate()
    • there's indentation inconsistency for the line of the second parameter ('update')
    • I'd declare update as 'const RRset&' (see above)
    • I'm not so sure if we can omit pre-scan for in memory data source.
    • (comment) RDLEN check must be in dns::Message (otherwise it would fail to parse the request), but double check is probably necessary, too
    • indent for the 'update' parameter
    • update_type should better be defined as const
    • this condition is insufficient (NONE should also be checked)
          if (update->getClass() != RRClass::ANY()) {
      
    • meta RR types should be checked for the case of class = NONE
    • about "other special handling", we should probably refer to BIND 9. It performs more checks. (bind9/bin/named/update.c:update_action())
  • callbackHelperRRsetIterator()
    • many questions including the design, but anyway...
    • is there any specific reason for using the postfix form of ++ (which is inefficient)?
              (*cur)++;
      
    • can cur/end be validly NULL? if there's no valid scenario for that, I'd rather throw an exception or assert() it.
  • callbackHelperRRsetVector()
    • I'd not rely on implicit conversion from a pointer to bool...
          if (v && i && *i < v->size()) {
      
  • ...but, more fundamentally, I'd also question whether we ever allow NULL for 'v' or 'i' (see also a similar comment on callbackHelperRRsetIterator) or out-of-range index for 'v' (except, perhaps for the end of the vector)
  • doIXFR()
    • I'd consider a state mismatch an exception:
          if (transaction.getState() != DataSrcTransaction::RUNNING) {
      
    • I'd defer the declaration of 'result' until around here:
          RRsetPtr next_rrset, last_rrset;
      
    • final_soa/first_soa could/should be of ConstRRsetPtr. Same for next_rrset and last_rrset. Furthermore, final and first could even be of 'const ConstRRsetPtr'.
    • this should be !first_soa:
          RRsetPtr first_soa = nextRRset(arg1, arg2);
          if (!final_soa) {
      

(btw: this might suggest unit tests are not sufficient.)

  • (a high level comment) this code assumes if the first two RRs are of SOA it's IXFR, but, technically, it's not really correct because an AXFR could also consist of just two SOA RRs (although, of course, the resulting zone would become bogus as it wouldn't have an NS). For that matter, I'm not really sure how much of this level of protocol processing is assumed to be done in the C++ code. Assuming most of protocol level consideration for XFRs should go to python, ideally we should be able to concentrate on operations at the level of data source abstraction, i.e., simply adding/deleting DB rows or replacing a table, without worrying about their DNS protocol implications.
  • (aside from the high level comment) I'd treat protocol-level errors such incorrect sequence of SOAs as an exception, or at least as a different type of error than W_xxx (because these errors are not really a data source level error).
  • the first while loop seems to be a duplicate of the internal of Sqlite3DataSrc::replaceZone(). Can't we unify the code?
  • this comment is incomplete?
            // We don't delete the actual SOA itself,
    
  • I'm not sure if this behavior is reasonable:
                    // check if rrset exists, if not, something is very wrong, abort
    
    RFC1995 doesn't seem to talk about this case, and FYI BIND 9 is generous about such a case:
    			} else if (result == DNS_R_UNCHANGED) {
    				/*
    				 [...]
    				 * It may happen when receiving an IXFR
    				 * from a server that is not as careful.
    				 * Issue a warning and continue.
    				 */
    
  • return value is ignored here:
            addRRset(transaction, final_soa);
            if (result != W_SUCCESS) {
    
  • the final return seems to be redundant:
        return (W_NOT_IMPLEMENTED);
    
  • doUpdate()
    • a high level comment: like doIXFR(), I'd wonder how much of protocol processing should be in the C++ lib.
    • shouldn't we care about state mismatch like doIXFR()? (interestingly, doIXFR() doesn't care about the case of Opcode mismatch. so the Opcode version of this question should go to doIXFR(). and, I'd treat an Opcode mismatch as an exception, too)
    • 'question' could/should be of 'const ConstQuestionPtr?'.
    • I don't understand the comment 'should we do transaction here?'
    • (aside from the high level comment above) I'd treat protocol-level errors such as zone name mismatch as an exception a different type of error than W_xxx (see the same comment to doIXFR()).
    • couldn't/shouldn't we use the postfix form of ++?
               it++) {
      
      (I'd personally rather use an STL algorithm or boost foreach, but wouldn't insist on it)
    • cur_prereq could/should be ConstRRsetPtr, or could even be omitted:
              Rcode prereq_result = updateCheckPrerequisite(transaction, *it);
      
      (but I'd even go further: passing a 'const RRset&' instead of a shared pointer. see the comment for updateCheckPrerequisite())
    • is it okay to clear the message in case of error? i.e. shouldn't we use makeResponse()?
    • same comments apply to the update section. in addition, why don't we update 'msg' in case of error?
    • (comment) BIND 9 doesn't seem to do anything with the additional section.
  • ~DataSrcTransaction?()
    • we don't have to call getState() (although the overhead would be very marginal)
    • I think we should be more careful about cases where rollbackTransaction() fails. When that happens it means something catastrophic, right? So, IMO we should have rollbackTransaction() report the type of error in as detailed as possible (whether it's via a return code or an exception), and deliver its implication to the user at all cost (for example, by pre-allocating necessary resources to log this event without triggering an exception). Also in this context, I'd request rollbackTransaction() to limit the possible exceptions within it so that this destructor can catch any possible exceptions more selectively (in fact, as far as I can see Sqlite3DataSrc::rollbackTransaction() should never throw an exception). (btw: in case I'm not clear I understand we cannot let an exception be propagated from a destructor)

(to be continued)

comment:5 Changed 9 years ago by jinmei

continuing review...still ongoing.

data_source.h

  • I made some trivial editorial changes to the branch (r3500)
  • you don't have to include dns/message.h. it should just suffice to add a forward declaration of Message in the isc::dns name space.
  • it's mostly the same for rrset.h, if we change ConstRRsetPtr to const RRset& where possible as I suggested above. The only tricky part is the return value of callbackHelper*(), but I guess it could be a bear pointer in this context
  • callbackHelperXXX(): I wonder whether it might make sense to do it in more C++-ish way. For example, we might define the "RRsetEnumerator" class, whose operator() returns the "next" RRset(ptr). We also define derived classes IteratorRRsetEnumerator, VectorRRsetEnumerator, etc, depending on the internal container. Then doIXFR() would look like:
    DataSrc::WriteResult
    DataSrc::doIXFR(DataSrcTransaction& transaction,
                    RRsetEnumerator& enumerator)
    {
        ...
        ConstRRsetPtr final_soa = enumerator();
        if (!final_soa) {
            return (W_ERROR);
        }
        ConstRRsetPtr first_soa = enumerator();
        if (!first_soa) {
            return (W_ERROR);
        }
        ...
    }
    
    This is not only about a matter of style, but also about more practical advantages. That is, this way we can make it type safer by not relying on passing around void* values and cast them to particular type of values. Also, with this approach we can hide derived classes like "VectorRRsetEnumerator" in the test code (it seems to me this one is introduced for the convenience of tests; in practice we'd normally if not always use RRsetIterator).

Alternatively we could make doIXFR(), etc templated so that it can
take generic iterators. It might be more C++-ish, although we
couldn't do that in this current design because doIXFR() is a
virtual function.

In either case, for a vector based interface I'd use the begin and
end iterators instead of an index.

  • AbstractDataSrc::WriteResult? I'd make the comment for each enum value a doxygen description (using '<')
  • DataSrcTransaction?
    • I'm not sure if it makes sense to attribute "zone name/class" to this class, because a transaction in a data source may not always be specific to a single zone. for example, we may want to add more than one zones to a data source as an atomic transaction. Also, even for the case where we add a single zone (which is not possible through the current public interface though), since the zone information is an initial attribute of DataSrcTransaction?, it would look like
          data_source.addZone(trans);
      
      this is a bit awkward in terms of interface consistency because when we add an RRset we pass the RRset(ptr) to be added.
  • so, I wonder whether we might make this class is a smaller primitive: just holding a scoped transaction for a data source, similar to the Boost scoped_lock. Then the code would look like as follows:
        void doIXFROrSomething(data_source) {
            DataSrcTransaction trans(data_source);
    	//
            // do whatever necessary within the transaction
    	//
    	trans.commit();
        }
    
    Below, however, I'll continue the review assuming the current interface for now.
  • member variable names shouldn't begin with "_" (which is (maybe by convention) reserved for standard definitions, in my understanding)
  • it may be better to provide doxygen description for states? and, according to the guideline, "states" should probably better be "States".
  • constructor: 'explicit' is not necessary for multi-parameter constructor.
  • destructor: since the behavior of the destructor may be critical (see the comment to its implementation), I think we should add some note in the doxygen style.
  • the getters should better be member const functions.
  • 'const' to the returned object type is redundant (getZoneName() and getZoneClass())
  • I'd define _zone_name and _zone_class as const.
  • I'm not sure why we pass a pointer of data source. can it be NULL?
  • getZoneName() could return a reference to _zone_name, which would be a bit more efficient.
  • I'd feel nervous about the setters (setState() and setData()). Due to them _state and _data are effectively public member variables, and make the class quite fragile to careless (buggy) usage. for example, a buggy implementation could set the state from "RUNNING" to "INIT" (or to "DONE" without commit or rollback), which would cause something very weird. Likewise, an application may modify or even reset to NULL the internal data. Since the Sqlite3DataSrc implementation assumes it's always a valid pointer containing expected data, it would make the application crash. We could blame the application saying it's "their bug", but it would be much better to prohibit such risky usage via the interface.
  • A simpler "scoped transaction" would be one possibility. Another is to introduce a class hierarchy of transactions from a base abstract class to a specific derived class per data source. With this approach, we define e.g. Sqlite3DataSrcTransaction derived from base DataSrcTransaction? and have an instance of Sqlite3DataSrc create an associated transaction:
        DataSrcTransaction* trans = data_source.createTransaction();
        // here, if data_source is of Sqlite3DataSrc, trans is of
        // Sqlite3DataSrcTransaction.
    
        data_source.addRRet(*trans, rrset); // or something like this
    
    Then, we can hide the interface and implementation within the definition of Sqlite3DataSrcTransaction.
  • getDataSource() doesn't seem to be used. since it's a mutable "handle", it would be better to not to disclose this information until we see the real need for it, that is, I suggest removing this method for now.
  • DataSrc class: the above design-level comment would also affect the modifications to DataSrc, but here I'll review it as is.
    • in general, I'd like to know whether/when the methods throw an exception.
    • addRRset(): '\return' is missing, in which W_SUCCESS should be. I'd be interested in what happens if the same <name,type> of rrset already exists.
    • delRRset(): \return is missing. what if no such RRset exists?
    • delZone(): there should be a period at the end of \brief line. otherwise doxygen considers the next line a part of the brief description. what if no such zone exists?

(to be continued)

comment:6 Changed 9 years ago by jinmei

continuing review...still ongoing.

data_source.h

  • (forgot to point this out) in the declaration of DataSrc::startTransaction(), if you specify 'const' for 'create_zone', you'll have to be consistent about that in its definition (and possibly the declaration and definition of derived versions) due to a deviant (or buggy) behavior of sunstudio.

sqlite3_datasrc.cc

  • boost/foreach.hpp doesn't seem to have to be included.
  • in ~Sqlite3Initializer(), shouldn't we handle w_add_all_, etc, too? and, this seems to indicate that it's time to introduce some refactoring to make the code robust.
  • Sqlite3DataSrc::open(): I'd make sq_result 'const int'. I'd fold the isc_throw() line for readability.
  • Sqlite3DataSrc::close(): w_del_all_ and w_del_zone_ are not cleared. perhaps another indication of the need for refactoring (see above).
  • Sqlite3DataSrc::startTransaction()
    • I'd throw an exception if state != INIT as it should be a program error at the caller side. same for commit/rollback.
    • we could store the result of transaction.getZoneName().toText() so that we can reuse it in the new zone case. just a comment.
    • shouldn't we expect zone_id == 0 as a valid case?
              if (zone_id <= 0) {
      
    • why do we assume the RRClass is IN?
                      DataSrc::WriteResult add_zone_res =
                          addZone(transaction.getZoneName(),
                                  isc::dns::RRClass::IN());
      
    • I'd make sure zone_id is valid after creation:
                      zone_id = hasExactZone(transaction.getZoneName().toText().c_str());
                      assert(zone_id >= 0);
      
    • how could the rollback fail in this context? depending on the reason, it may be better to handle it differently than returning the generic error code, e.g. throw an exception.
    • I'd avoid hardcoding keyword "zone_id" for trans_data. same for other part of the code that uses this keyword.
  • Sqlite3DataSrc::commitTransaction()
    • I'd declare 'result' as 'const int' or even omit the use of a temporary variable. same for rollback.
  • Sqlite3DataSrc::addZone()
    • in the first call to sqlite3_bind_text(), it seems that the last parameter can be SQLITE_STATIC. Also, if we use TRANSIENT, we can avoid having a temporary variable (s_name), although I wouldn't be so nervous about a "redundant" variable if it's a const.

Furthermore, the mixed usage of STATIC and TRANSIENT may confuse
code readers (and might trigger an accidental regression in future
by a careless developer). Ideally we should keep the consistent
style, or at least leave a comment about our policy of when to use
STATIC/TRANSIENT. (one possible approach is to use TRANSIENT
consistently in the "write" operations because its performance
impact should be relatively less important).

This comment applies to some of the other cases in other methods.

  • what should we do if the zone for the <name, rrclass> already exists?
  • I'd avoid using a temporary mutable variable 'rc' if we don't use it except for '!= SQLITE_OK'. same for 'rc' in other methods.
  • I'd declare 'result' as const int.
  • Sqlite3DataSrc::addRRset()
    • I'd throw an exception if getRdataCount == 0 as it should be a program error at the caller side.
    • zone_id could (should) be 'const int'
    • in this context we cannot naively assume getData() or get("zone_id") returns a valid value. (see also the higher level comment about the design)
  • Sqlite3DataSrc::delRR(without rdata)
    • looks like the third call to sqlite3_bind_text can fit in a single line:
              rc = sqlite3_bind_text(query, 3, "%", -1, SQLITE_STATIC);
      
    • is it okay to use sqlite3_total_changes() for this purpose? according to the sqlite3 manual, it's the number of changes "since the database connection was opened". btw, if my concern is real, that would suggest unit tests may not be sufficient.
    • I'd declare 'deleted_rows' as 'const int'.
    • not specific to this method, but as I've seen so many repetitive patterns around sqlite3_bind_text, I wonder we might unify this pattern to avoid repeating the duplicate logic. e.g, we might define something like
          void bindName(sqlite3_stmt* query, const string& name_str, const int column) {
              if (sqlite3_bind_text(query, column, name_str.c_str(), -1,
                                    SQLITE_STATIC) != SQLITE_OK) {
                  isc_throw(...);
              }        
          }
          void bindName(sqlite3_stmt* query, const char* const name_cstr, const int column) {
              if (sqlite3_bind_text(query, column, name_cstr, -1,
                                    SQLITE_TRANSIENT) != SQLITE_OK) {
                  isc_throw(...);
              }        
          }
      
      then we can simplify each method like as follows:
          ...
          bindZone(query, 1, zone_id)
          bindName(query, 2, name.toText().c_str());
          bindRRType(query, 3, RRType::ANY() ? "%" : rrtype.toText().c_str());
          ...
      
      just a thought.
  • Sqlite3DataSrc::delRR(with rdata)
    • most part of the code is a duplicate of the non rdata version. smells like the need for refactoring.
  • Sqlite3DataSrc::delAll()
    • at the end of the method, for such a simple if-else pattern I'd use ?: to keep the size of the method shorter:
          return (result == SQLITE_DONE ? W_SUCCESS : W_DB_ERROR);
      
      but I understand this is largely a matter of preference. it's just a comment. oh, btw, in the scope of Sqlite3DataSrc you can ommit "DataSrc::".
  • Sqlite3DataSrc::replaceZone()
    • I'd thrown an exception if state != RUNNING.
    • getData() and get("zone_id") may return an invalid value.
    • is there a valid scenario that we see NULL for nextRRset? if not, I'd rather throw an exception...hmm, DataSrc::doIXFR() has that usage, but in data_source.h it's not clear what we'd expect when nextRRset is NULL.
    • I'd declare next_rrset as ConstRRsetPtr.
  • Sqlite3DataSrc::delZone()
    • I'd thrown an exception if state != RUNNING.
    • getData() and get("zone_id") may return an invalid value.
    • hmm, I now realize this method (and most if not all of the writable variants of the methods) itself doesn't provide strong exception/failure guarantee, i.e, when something bad happens the DB may stay in a strange state. it should be okay because it's assumed to be used with a transaction context, but I think it's better to note that fact explicitly. (see also below for the relevant comment about sqlite3_datasrc.h)
  • Sqlite3DataSrc::delRRset()
    • same usual comments as delZone()

sqlite3_datasrc.h

  • we should be able to omit including most of the added .h's. as far as I can see the only necessary header file is rrset.h for RRsetPtr. for others more lightweight forward declaration should suffice.
  • The public methods do not simply follow the interface defined in the base class, but have some sqlite3 specific behavior (especially about when to throw exceptions). So I think we should provide documentation for these methods focusing on the specific behavior while referring to the base class.
  • indentation isn't well aligned for delRR() (both of the two variants).

(to be continued)

comment:7 Changed 9 years ago by jinmei

continuing review...this is the last chunk of comments.

sqlite3_unittest.cc

  • I made some trivial editorial changes to the branch (r3509)
  • general comments
    • there are mixture of ASSERT_EQ() and EXPECT_EQ(), and some of them don't seem to have to be ASSERT. did you intentionally use ASSERT_EQ() where it's used?
    • according to lcov there are some parts that are not tested, including:
      • some of the "higher level" methods are not tested directly.
      • some FORMERR cases in updateCheckPrerequisite()
      • some error cases in updateProcessUpdate()
      • same for doIXFR()
    • overall, the newly added tests rather seem to belong to the higher level layer. I didn't see much (if any) part in the tests that's specific to sqlite3. also, if we test them at the higher level using some mock data source, it will be easier to test cases with DB errors.
  • SQLITE_DBFILE_WRITE could (should) be of ConstElementPtr?.
  • zone_name could (should) be of const Name.
  • zone_name is doubly defined (in the scope of Sqlite3DataSourceTest.) is that intentional?
  • do we need to copy the writable data source file to the build dir in the test? ... ah, I see we need to do that to "reset the DB". actually, in that case, I'd introduce a separate class, always rest the DB in each test's initialization phase (constructor or SetUp?()). see also the next bullet and the relevant comment on the delRRset test below.
  • install_writable_database() should be named installWritableDatabase?
  • many of the new tests first switch to the writable data source. maybe we should introduce a separate test class
  • Transactions test
    • I don't understand this comment:
          // need rrsetit for doIXFR (but doIXFR should probably be moved up to datasource itself)
      
      (btw it's too long). maybe it's relevant to rrset_it defined above (note that it's not "rrsetit"), which is unused in the test.
  • delRRset test
    • I'm afraid it's not a good idea to rely on a side effect

of another test. for example, we cannot selectively run this test:

% ./run_unittests --gtest_filter=Sqlite3DataSourceTest.delRRset
(this fails)
  • these should be "delete it", and "add it", resp:
        // add it, but roll back
        ...
        // add it, and commit
    
  • getRRsetCallback_empty() should be named getRRsetCallbackEmpty?
  • replaceZone_callback_empty test
    • a very minor nit, but considering the test name is a function name, this should probably be replaceZoneCallbackEmpty? same for other test names.
  • replaceZone_new_zone test
    • 'rrset' could (should) be a vector of ConstRRsetPtr.
    • initialization of soa_rrset could be a bit simplified:
          RRsetPtr soa_rrset(new RRset(Name("new.example.com"),
                                       RRClass::IN(),
                                       RRType::SOA(),
                                       RRTTL(3600)));
      
      same comment applies to some other similar cases, too.
    • namespace can be omitted for createRdata(). this will help prevent the parameter lines from being too long (damaging readability).
  • general comments about helper functions
    • I think it should be defined somewhere else (possibly with some generalization) so that we can share it in other test cases. we might also want to integrate this feature into lib/dns/tests/testdata/gen-wiredata.py
    • from this point of view, it's not very clear how to populate test data other than by reading short comments about rrsetsFromFile() and source code + test examples. we'll need more friendly description about how to use it. (admittedly, gen-wiredata.py is bad on this point, too, and dns/unittest_util is not very good either)
  • questionFromFile()
    • I believe the return value could/should be of ConstQuestionPtr?.
    • error cases after operator>> or getline() are ignored (this comment applies to other helper functions). Understanding this is just a helper function in tests, I could live with it. Personally, however, I'd still to catch error cases, because the input is handmade data and we sometimes make errors. If we still ignore it, please at least leave a comment explaining that's intentional (for simplicity, etc).
    • a specific related point. this code effectively works as EOF determination:
          string s;
          file >> s;
          if (s == "") {
              return (QuestionPtr());
          }
      
      but IMO it's fragile because it's implicit and some specific value in s.
    • with this parsing code we'd accept multi-line input like this
      example.com IN
      SOA
      
      Is that intentional? If so, is it okay to accept the following example?
      example.com IN
      SOA another.example IN A 192.0.2.1
      
      ("another.example..." will simply be ignored)
    • if multi-line input isn't expected, I'd suggest the following revise:
      • call readline() at the outer loop in rrsetsFromFile() and updateFromFile(), and pass the extracted to string a string version of rrsetFromFile() (say, rrsetFromString()):
           string line;
           while (getline(ifs, line), !ifs.eof()) {
             if (ifs.bad() || ifs.fail()) {
               //error handling
             }
             ConstRRsetPtr rrset = rrsetFromString(line);
        
      • in rrsetFromString() use istringstream to parse that line:
           rrsetFromFile(const string& line) {
               istringstream ss(line);
               string s;
               ss >> s;
               ...
           }
        
    • the call to getline() at the end of this function will result in ignoring erroneous input as a question like "example IN A 192.0.2.1" (which is valid for other sections). shouldn't we rather catch and reject that?
  • rrsetFromFile()
    • most of the comments on questionFromFile() apply here, too
    • initialization of rrset could be simplified:
          RRsetPtr rrset(new RRset(n, rrclass, rrtype, ttl));
      
    • I'd use stringbuf to get the "rest of the line" (skipping spaces):
        stringbuf sbuf;
        is >> &sbuf;
        if (ss.fail()) {
          // no additional input
        } else if (!ss.bad()) {
          rrset->addRdata(createRdata(rrtype, rrclass, sbuf.str()));
        }
      
    • the call to addRdata() could be simplified as shown in the above example (without using a temporary variable 'rdata')
  • rrsetsFromFile() (msg version)
    • the first block of comment seems to be out of sync. it doesn't set flags/codes; it doesn't only handle the answer section.
    • the code logic inside the while loop is difficult to understand to me due to multiple nested conditions with mutable variables. At least there are some obviously redundant code:
      • this test is unnecessary
                    if (rrset) {
        
        because at this point rrset must be non NULL.
      • updating prev_rrset can be unified:
                    prev_rrset = rrset;
                } else {
                    prev_rrset = rrset;
                }
        
      thereby eliminating an unnecessary else clause.

But, frankly, I'd be still not sure if the code is sufficiently
readable. I'd feel a bit more comfortable with the following code:

    while (! file.eof() ) {
        const RRsetPtr rrset = rrsetFromFile(file);
        const bool same_type =
            (rrset && prev_rrset &&
             prev_rrset->getName() == rrset->getName() &&
             prev_rrset->getType() == rrset->getType() &&
             prev_rrset->getType() != RRType::SOA() &&
             prev_rrset->getClass() == rrset->getClass());
        // if this is the end of the section or we've got a new type of RRset,
        // add the RRset we've constructed so far, if any.
        if (prev_rrset && (!rrset || !same_type)) {
            msg.addRRset(section, prev_rrset);
        }
        // if this is the end of the section we are done.
        if (!rrset) {
            return (1);
        }
        // update prev_rrset: if this is a new RR of the same type, merge it.
        // otherwise replace it.
        if (same_type) {
            RdataIteratorPtr it = rrset->getRdataIterator();
            for (it->first(); !it->isLast(); it->next()) {
                prev_rrset->addRdata(it->getCurrent());
            }
        } else {
            prev_rrset = rrset;
        }
    }

The point is to unify the point of doing addRRset(), and to make
rrset a const variable (note: not ConstRRsetPtr) so that prev_rrset
is the only mutable variable. But I must admit this code is not
obviously clearer than the original, so if you don't agree, please
ignore the suggestion and move forward with the trivial cleanup by
removing the redundancy.

(for that matter, we may need a "merge mode" for the RRset class or

introduce a notion of "RR" and provide a primitive of "addRR" for
usage like this. but that's a different topic)

  • as already noted, file.eof() never happens.
  • the intent of special case for SOA isn't so obvious. maybe we need some comment here.
  • rrsetsFromFile (ifstream + vector version)
    • most of the code is a duplicate of the other version. I see the need for refactoring. we could use a template with some helper class, or unify the main code in the vector version and have the msg version use it.
  • updateFromFile()
    • this code is not very readable to me, due to the tricky usage with stage (and the magic numbers for 'stage'). it looks like if we unified the interface of questionFromFile() with that of rrsetsFromFile(), the code would become more straightforward:
              while (! myfile.eof() ) {
      	    questionFromFile(myfile, m);
                  rrsetsFromFile(myfile, m, Section::ANSWER());
                  rrsetsFromFile(myfile, m, Section::AUTHORITY());
                  rrsetsFromFile(myfile, m, Section::ADDITIONAL());
              }
      
      (and I guess we could even share the code for questionFromFile and rrsetsFromFile, e.g. using a template).
  • checkSingleRRset()
    • text comparison is not really good (fragile to minor implementation changes). ideally, for example, we'd pass a vector of RDATA strings with name/type/class, then in checkSingleRRset() we'd construct another vector of RDATA from the retrieved RRset, sort both vectors and then compare them. We may also include such tests in a shared place such as unittest_util. But, perhaps we should live with the current tests within this ticket - it's already too big with too many review comments.
    • what's the purpose of commenting out a test?
          //EXPECT_EQ(DataSrc::SUCCESS, find_flags);
      
      if it's intentionally commented-out, please explain the intent; otherwise, please simply remove it.
  • rrsetsFromFile(): container could/should be a vector of ConstRRsetPtr.
  • dynamic_update_bad_class test
    • the comment says "no/bad question section", but doesn't it only test the case for "no question section"? or perhaps you intended to set a question section with the IN class?
  • dynamic_update_prereq_fails test
    • for "update4.packet", if we consider each RR (not RRset) a prerequisite, it's fifth prereq that fails. same comment applies to the subsequent updateX.packet. (but this is quite a minor point)

comment:8 Changed 9 years ago by jinmei

  • Owner changed from UnAssigned to jelte

review completed, finally. giving the ticket back to Jelte. enjoy:-)

comment:9 Changed 9 years ago by jelte

  • Owner changed from jelte to jinmei

(lots of) changes in r3637. I left out all the points that I simply took over and applied. Those that needed comment/discussion are below.

general design

I personally still believe it's better to move DNS protocol specific
behavior out of the data source module as you perhaps felt. I've
commented on some specific points in this context below, too. But
assuming this is an initial attempt that will be revisited, I'm okay
with completing this work without touching this level of design issue.

The reason I still think we shouldn't is that I feel that for some
modules, there might be a significant gain of having the ability to
override the high-level methods. Perhaps this is not true, but it was
the original idea for the (nonwritable) datasource too :)

Likewise, I still believe the current class hierarchy is not well
defined. The fact that many derived methods end up just returning
'not implemented' means (IMO) the relationship between the base and
derived classes is not a reasonable "is-a" relationship. But, again,
I'm okay with completing this work without touching this level of
topic.

Assuming for a minute that we do leave the protocol processing in c++,
what about making a DataSource? and a WritableDataSource? (which itself
is a subclass of DataSource?)?

Part of this is not necessarily writable-related;

general comments about protocol processing for dynamic update and IXFR

In general, it seems the current implementation needs more checks to
make it protocol compliant (see specific comments for some specific
issues). It's also not clear how much of the protocol processing
should be done in the C++ code, assuming we'll write python daemons
for these protocols.

If we can cleanly separate dynamic update and IXFR part of the code,
maybe we should merge the general part (possibly with the part
necessary for AXFR and zone loading) first, and then revisit the
protocol processing part with more unit tests (or even reconsider
whether to provide C++ code for them). Or, perhaps we could integrate
these parts as a "work in progress" item for now (after review, of
course) and not actually use them in our server programs, and
revisit/redesign them later.

All signs, btw, that this task was a tad on the side of way too big ;)

Currently both IXFR and dynamic update are implemented in data_source,
and as separate methods, so, apart from the fact that update takes an
actual DNS message, and IXFR a (python-style) generator, it's already
pretty much separated in terms of code (as is AXFR, which can simply
be done with replaceZone()).

data_source.cc

o I'm afraid the current implementation of 'equality' check is incomplete. Consider the following corner cases (some of which may not be so rare)

+ RRsets with different TTLs
+ there's a duplicate RDATA (e.g. there are two A RRsets, one contains a single '192.0.2.1', and the other contains two '192.0.2.1's.
+ RDATA is equal as a set, but the ordering is different.

o FYI, BIND 9 takes care of the ordering issue. ignoring TTLs is probably okay in this context (and I know it's even 'necessary') especially because it's a private helper function. but I'd add an explicit note that TTL is ignored in the definition of 'equal'.

Yes, actually I think equality (and internal sorting) should go into
the RRset class. Come to think of it, in some cases the TTL could be
important, so perhaps that should be an option (or a second equality
method).

I made it static (now local namespace) mostly for that reason, did get
around to adding it to RRset yet.

  • haveRRset()

o is it okay to assume the IN class for ANY/NONE?

did our datasources have one class per datasource? should we use that?

o as a public method, the logic is too dynamic update-specific and counter intuitive. this could actually be non public, non method function?

renamed it updateHaveRRset to reflect that it is not for general use.
Did leave it public (for the same reasons that most functions are public right now; we might want to override them for specific datasources.

  • updateCheckPrerequisite()

o what about case RRSIG? (shouldn't we cover type-covered? FYI, BIND9 does something special in case of updating RRSIG)

Does it? Interesting. I was under the impression that DNSSEC did not provide additional burden on dynamic updates. Do we have that documented somewhere?

o does this code cover empty non-terminal cases?

it should; it does check rdata counts > 0. Added a test for the check name prerequisite to make sure. (btw that test should probably test all of them :) )

  • updateProcessUpdate()

o I'm not so sure if we can omit pre-scan for in memory data source.

I was kind of thinking we should have transactions for in-mem as well. If not, I guess we can refactor so that doUpdate does do the prescan.

o (comment) RDLEN check must be in dns::Message (otherwise it would fail to parse the request), but double check is probably necessary, too
o this condition is insufficient (NONE should also be checked)

if (update->getClass() != RRClass::ANY()) {

o meta RR types should be checked for the case of class = NONE
o about "other special handling", we should probably refer to BIND 9. It performs more checks. (bind9/bin/named/update.c:update_action())

i refactored it to better reflect the rfc's pseudocode. Apart from the NOTZONE check (which is still not there btw), I don't readily see any other checks in the bind9 prescan code btw.

  • callbackHelperRRsetIterator()
  • many questions including the design, but anyway...

Perhaps some Enumerator class can also be used, as long as we have the possibility of providing our own 'one-at-a-time' functions (like python iterators)

The main goal is to have *something* that does not make us copy full messages or lists/vectors (which the first design did have; for AXFR we needed to append all incoming messages).

(other comments addressed)

  • doIXFR()
  • (a high level comment) this code assumes if the first two RRs are of SOA it's IXFR, but, technically, it's not really correct because an AXFR could also consist of just two SOA RRs (although, of course, the resulting zone would become bogus as it wouldn't have an NS). For that matter, I'm not really sure how much of this level of protocol processing is assumed to be done in the C++ code. Assuming most of protocol level consideration for XFRs should go to python, ideally we should be able to concentrate on operations at the level of data source abstraction, i.e., simply adding/deleting DB rows or replacing a table, without worrying about their DNS protocol implications.

right, we do need to have that discussion, since i disagree :)

the check whether the 'final' SOA is indeed a SOA is done a few lines later (if it's not it's treated as exfr and replaceZone() is called)

  • (aside from the high level comment) I'd treat protocol-level errors such incorrect sequence of SOAs as an exception, or at least as a different type of error than W_xxx (because these errors are not really a data source level error).

add an (I)XFRException? I'm not too sure about this. Not that i'm opposed, but my impression of where we would draw the line of making something an exception would be above this.

  • the first while loop seems to be a duplicate of the internal of Sqlite3DataSrc::replaceZone(). Can't we unify the code?

perhaps, but we'd need some sort of 'reset' since the first two records have already been 'eaten'. Come to think of it, the first non-soa record in the axfr-as-ixfr packet wasn't added to the code (yes, again, missing tests)

  • this comment is incomplete?

We don't delete the actual SOA itself,

no just wrong punctuation

  • I'm not sure if this behavior is reasonable:

check if rrset exists, if not, something is very wrong, abort

RFC1995 doesn't seem to talk about this case, and FYI BIND 9 is generous about such a case:

} else if (result == DNS_R_UNCHANGED) {

/*

[...]

  • It may happen when receiving an IXFR
  • from a server that is not as careful.
  • Issue a warning and continue. */

I think bind9 is too lenient in a lot of cases ;)

I personally prefer erroring, since obviously the master disagrees with the slave on the zone contents. But my opinion isn't too strong in this case, so I guess I could be persuaded to just log a warning and move on.

  • doUpdate()

o a high level comment: like doIXFR(), I'd wonder how much of protocol processing should be in the C++ lib.

open for discussion, still think 'as much as possible' :)

o shouldn't we care about state mismatch like doIXFR()? (interestingly, doIXFR() doesn't care about the case of Opcode mismatch. so the Opcode version of this question should go to doIXFR(). and, I'd treat an Opcode mismatch as an exception, too)

yes we should. OTOH in the current API, doIXFR is not called on a Message, so it cannot check.

o I don't understand the comment 'should we do transaction here?'

I was wondering if we should not have the transaction passed in the call to doUpdate, but have doUpdate call start/rollback/commit itself.

o (aside from the high level comment above) I'd treat protocol-level errors such as zone name mismatch as an exception a different type of error than W_xxx (see the same comment to doIXFR()).

See above :)

o couldn't/shouldn't we use the postfix form of ++?

it++) {

prefix you mean? probably, done.

(I'd personally rather use an STL algorithm or boost foreach, but wouldn't insist on it)

Wouldn't that need a more general getSection() in Message? I tried keeping messing around in other parts to a minimum

o is it okay to clear the message in case of error? i.e. shouldn't we use makeResponse()?
o same comments apply to the update section. in addition, why don't we update 'msg' in case of error?

Hmm, the caller should probably do the message handling (and that probably ties back to the discussion where the intelligence should be ;) )
For now, I'll let it return the Rcode instead of WriteResult?, and let the caller handle the response.

o I think we should be more careful about cases where rollbackTransaction() fails. When that happens it means something catastrophic, right? So, IMO we should have rollbackTransaction() report the type of error in as detailed as possible (whether it's via a return code or an exception), and deliver its implication to the user at all cost (for example, by pre-allocating necessary resources to log this event without triggering an exception). Also in this context, I'd request rollbackTransaction() to limit the possible exceptions within it so that this destructor can catch any possible exceptions more selectively (in fact, as far as I can see Sqlite3DataSrc::rollbackTransaction() should never throw an exception). (btw: in case I'm not clear I understand we cannot let an exception be propagated from a destructor)

Yes, I agree, the catch-all is to be safe in the case other datasources do (accidentally) throw an exception.
I for now also check the return value of rollback, but atm again with only a comment that we should log.

data_source.h

  • it's mostly the same for rrset.h, if we change ConstRRsetPtr to const RRset& where possible as I suggested above. The only tricky part is the return value of callbackHelper*(), but I guess it could be a bear pointer in this context

Using bare pointer now, though we do lose the usefullness of the shared ptr here... thinking about the correct location to mention that in the docs :)

  • callbackHelperXXX(): I wonder whether it might make sense to do it in more C++-ish way. For example, we might define the "RRsetEnumerator" class, whose operator() returns the "next" RRset(ptr). We also define derived classes IteratorRRsetEnumerator, VectorRRsetEnumerator, etc, depending on the internal container. Then doIXFR() would look like:

DataSrc::WriteResult?
DataSrc::doIXFR(DataSrcTransaction?& transaction,

RRsetEnumerator& enumerator)

{

...
ConstRRsetPtr final_soa = enumerator();
if (!final_soa) {

return (W_ERROR);

}
ConstRRsetPtr first_soa = enumerator();
if (!first_soa) {

return (W_ERROR);

}
...

}

This is not only about a matter of style, but also about more practical advantages. That is, this way we can make it type safer by not relying on passing around void* values and cast them to particular type of values. Also, with this approach we can hide derived classes like "VectorRRsetEnumerator" in the test code (it seems to me this one is introduced for the convenience of tests; in practice we'd normally if not always use RRsetIterator).

Alternatively we could make doIXFR(), etc templated so that it can
take generic iterators. It might be more C++-ish, although we
couldn't do that in this current design because doIXFR() is a
virtual function.

In either case, for a vector based interface I'd use the begin and
end iterators instead of an index.

The vector-based one was really meant as an example of how something
else than an iterator could be used. As I may have mentioned earlier;
the biggest target here is to be able to pass a function. I guess this
can be done with an enumerator class as well (one that wraps said
function); But in practice, for AXFR you don't want to build the list of
records in advance, you want to read out Message objects one at a time
and read the next when you're done with the current. The IXFR RFC
doesn't specifically mention this part from AXFR, but we assumed that
large IXFR over TCP can also be split up over several messages as well.

Therefore the actual XFR handler (in trac376), provides its own iterator
function that returns the next RRset until it has read the entire
message, then fetches the next one (or until it is done).

Again, if you really think we should to this with a new virtual class,
we can do that, so far i've tried to keep the number of classes a bit
limited, but I should probably stop doing that :)

o I'm not sure if it makes sense to attribute "zone name/class" to this class, because a transaction in a data source may not always be specific to a single zone. for example, we may want to add more than one zones to a data source as an atomic transaction. Also, even for the case where we add a single zone (which is not possible through the current public interface though), since the zone information is an initial attribute of DataSrcTransaction??, it would look like

data_source.addZone(trans);

this is a bit awkward in terms of interface consistency because when we add an RRset we pass the RRset(ptr) to be added.

  • so, I wonder whether we might make this class is a smaller primitive: just holding a scoped transaction for a data source, similar to the Boost scoped_lock. Then the code would look like as follows:

void doIXFROrSomething(data_source) {

DataSrcTransaction? trans(data_source);

do whatever necessary within the transaction

trans.commit();

}

The reason to tie a transaction with a specific zone is intended so we can do a similar thing as for queries; if you start a transaction on the highest level (ie. the meta data source), it should be able to automatically find the correct datasource for the zone you provide (which in this way is enforced by the API, though currently not yet implemented in MetaDataSrc?).

After a bit of discussion with Evan, we decided that the way to add a zone would be to have an option in startTransaction that create it if it does not exist yet. Depending on the datasource, this could then be denied or accepted (some datasources might not support the addition of zones).

Below, however, I'll continue the review assuming the current interface for now.

  • member variable names shouldn't begin with "_" (which is (maybe by convention) reserved for standard definitions, in my understanding)

After doing a bit of python (where one should prepend private variables with _, it kind of stuck. Changing them to end with _.

  • I'm not sure why we pass a pointer of data source. can it be NULL?

no, but it can't be copied, and it may end up being a different one than the one, see earlier comment about MetaDataSrc?

  • getZoneName() could return a reference to _zone_name, which would be a bit more efficient.
  • I'd feel nervous about the setters (setState() and setData()). Due to them _state and _data are effectively public member variables, and make the class quite fragile to careless (buggy) usage. for example, a buggy implementation could set the state from "RUNNING" to "INIT" (or to "DONE" without commit or rollback), which would cause something very weird. Likewise, an application may modify or even reset to NULL the internal data. Since the Sqlite3DataSrc implementation assumes it's always a valid pointer containing expected data, it would make the application crash. We could blame the application saying it's "their bug", but it would be much better to prohibit such risky usage via the interface.

For state, sure, for data, i don't see how (since the whole point of the class is to give datasources the ability to store whatever they need in it)

  • A simpler "scoped transaction" would be one possibility. Another is to introduce a class hierarchy of transactions from a base abstract class to a specific derived class per data source. With this approach, we define e.g. Sqlite3DataSrcTransaction derived from base DataSrcTransaction?? and have an instance of Sqlite3DataSrc create an associated transaction:

DataSrcTransaction?* trans = data_source.createTransaction();
here, if data_source is of Sqlite3DataSrc, trans is of
Sqlite3DataSrcTransaction.

data_source.addRRet(*trans, rrset); or something like this

Then, we can hide the interface and implementation within the definition of Sqlite3DataSrcTransaction.

Yes, that is another way to go. Once again, I tried to keep the actual number of classes a bit limited. I did not want to force module writers to have to subclass multiple classes. Though it is open for discussion, and people may convince me otherwise :)

  • DataSrc class: the above design-level comment would also affect the modifications to DataSrc, but here I'll review it as is.

o in general, I'd like to know whether/when the methods throw an exception.
o addRRset(): '\return' is missing, in which W_SUCCESS should be. I'd be interested in what happens if the same <name,type> of rrset already exists.
o delRRset(): \return is missing. what if no such RRset exists?
o delZone(): there should be a period at the end of \brief line. otherwise doxygen considers the next line a part of the brief description. what if no such zone exists?

Since the WriteResult? already have internal documentation, i tend to simply say 'error value otherwise', in this case i'll make an exception for not implemented, but I prefer not te re-enumerate all cases for every function.

sqlite3_datasrc.cc

  • in ~Sqlite3Initializer(), shouldn't we handle w_add_all_, etc, too? and, this seems to indicate that it's time to introduce some refactoring to make the code robust.

probably :)

  • Sqlite3DataSrc::startTransaction()

o I'd avoid hardcoding keyword "zone_id" for trans_data. same for other part of the code that uses this keyword.

Resolution pending other discussion of having subclassable transaction class.

  • Sqlite3DataSrc::addZone()

o in the first call to sqlite3_bind_text(), it seems that the last
parameter can be SQLITE_STATIC. Also, if we use TRANSIENT, we can
avoid having a temporary variable (s_name), although I wouldn't be so
nervous about a "redundant" variable if it's a const.

Furthermore, the mixed usage of STATIC and TRANSIENT may confuse
code readers (and might trigger an accidental regression in future
by a careless developer). Ideally we should keep the consistent
style, or at least leave a comment about our policy of when to use
STATIC/TRANSIENT. (one possible approach is to use TRANSIENT
consistently in the "write" operations because its performance
impact should be relatively less important).

Yeah I removed the variable(s) again (which imo, was the inconsistency
here, not the use of STATIC vs TRANSIENT itself, though the wrong use
of TRANSIENT was most probably because of the addition of said variable).

I'm not sure i agree with the possible approach you mention;
having it depend on what type of function we are in ourselves (though
now it does follow that approach, since I have not updated the
BTW, I think using STATIC can always make careless developers trip,
even if we consistently use it.

If you strongly prefer the other way around (always use local variable
and STATIC), I would probably not argue that much though :)

  • what should we do if the zone for the <name, rrclass> already exists?

The (current) API design should prevent that from happening (and we
couldn't test it because of that).

  • I'd avoid using a temporary mutable variable 'rc' if we don't use it except for '!= SQLITE_OK'. same for 'rc' in other methods.

As with the other thing(s), should i change the existing code as well? (some things were actually taken over for 'style' consistency :))

  • Sqlite3DataSrc::addRRset()

o in this context we cannot naively assume getData() or get("zone_id") returns a valid value. (see also the higher level comment about the design)

Left in, pending discussion.

  • Sqlite3DataSrc::delRR(without rdata)

o not specific to this method, but as I've seen so many repetitive patterns around sqlite3_bind_text, I wonder we might unify this pattern to avoid repeating the duplicate logic. e.g, we might define something like

void bindName(sqlite3_stmt* query, const string& name_str, const int column) {

if (sqlite3_bind_text(query, column, name_str.c_str(), -1,

SQLITE_STATIC) != SQLITE_OK) {

isc_throw(...);

}

}
void bindName(sqlite3_stmt* query, const char* const name_cstr, const int column) {

if (sqlite3_bind_text(query, column, name_cstr, -1,

SQLITE_TRANSIENT) != SQLITE_OK) {

isc_throw(...);

}

}

then we can simplify each method like as follows:

...
bindZone(query, 1, zone_id)
bindName(query, 2, name.toText().c_str());
bindRRType(query, 3, RRType::ANY() ? "%" : rrtype.toText().c_str());
...

just a thought.

Yeah sounds like a good idea. Though with all the rest still here skipping that for now :)

  • Sqlite3DataSrc::delRR(with rdata)

o most part of the code is a duplicate of the non rdata version. smells like the need for refactoring.

in relation to the earlier comment, i don't think it would be that much (it's almost all binding and checks for that)

  • Sqlite3DataSrc::delAll()

o at the end of the method, for such a simple if-else pattern I'd use ?: to keep the size of the method shorter:

return (result == SQLITE_DONE ? W_SUCCESS : W_DB_ERROR);

I was under the impression our style guide said not to use that construct, especially for return?

sqlite3_datasrc.h

  • The public methods do not simply follow the interface defined in the base class, but have some sqlite3 specific behavior (especially about when to throw exceptions). So I think we should provide documentation for these methods focusing on the specific behavior while referring to the base class.

that's an oversight; which i have fixed now (i don't think there should be any implementation-specific exceptions bubbling out here; i've made Sqlite3Error a subclass of DataSourceError?)

o according to lcov there are some parts that are not tested, including:

+ some of the "higher level" methods are not tested directly.
+ some FORMERR cases in updateCheckPrerequisite()
+ some error cases in updateProcessUpdate()
+ same for doIXFR()

o overall, the newly added tests rather seem to belong to the higher level layer. I didn't see much (if any) part in the tests that's specific to sqlite3. also, if we test them at the higher level using some mock data source, it will be easier to test cases with DB errors.

Right, will look into this, but thought it was high time to put
something out there, so you can look at the rest first :)

  • do we need to copy the writable data source file to the build
  • dir in the test? ... ah, I see we need to do that to "reset the
  • DB". actually, in that case, I'd introduce a separate class,
  • always rest the DB in each test's initialization phase
  • (constructor or SetUp??()). see also the next bullet and the
  • relevant comment on the delRRset test below.

BTW, we also need to copy it because the source tree might not be
writable.

  • general comments about helper functions

o I think it should be defined somewhere else (possibly with some generalization) so that we can share it in other test cases. we might also want to integrate this feature into lib/dns/tests/testdata/gen-wiredata.py
o from this point of view, it's not very clear how to populate test data other than by reading short comments about rrsetsFromFile() and source code + test examples. we'll need more friendly description about how to use it. (admittedly, gen-wiredata.py is bad on this point, too, and dns/unittest_util is not very good either)

Yes, I agree, and as discussed on jabber, I think some of this is even suitable for the DNS library itself. So I'm leaving this in for now.

  • questionFromFile()

o I believe the return value could/should be of ConstQuestionPtr??.

should Message::addQuestion also have an ConstQuestionPtr? argument? (or should I use *constquestionptrvar)

<snipping part of helper functions>

I applied some of the smaller commentary (and took over your code block)
then left most of it as-is. If we're gonna do this in the library anyway,
i don't want to waste too much effort here right now (if not, I need to
revisit this). Once issue that would occur should we for instance
refactor vector and msg versions of rrsetsFromFile is that
Message::addRRset wants an RRsetPtr and not a ConstRRsetPtr. But it may
be early and I may be missing something here :)

(for that matter, we may need a "merge mode" for the RRset class or

introduce a notion of "RR" and provide a primitive of "addRR" for
usage like this. but that's a different topic)

+1

comment:10 Changed 9 years ago by shane

  • Milestone set to A-Team-Task-Backlog

comment:11 Changed 9 years ago by jinmei

  • Owner changed from jinmei to UnAssigned

comment:12 Changed 9 years ago by stephen

  • Milestone A-Team-Task-Backlog deleted

Milestone A-Team-Task-Backlog deleted

comment:13 Changed 9 years ago by jinmei

  • Owner changed from UnAssigned to jinmei
  • Status changed from reviewing to accepted

I'm removing this ticket from the review queue for now as hinted on jabber. I've added a link to this ticket from one of the
IXFR tickets (#799) so that we won't easily lose this one.

comment:14 Changed 9 years ago by jinmei

  • Owner changed from jinmei to UnAssigned
  • Status changed from accepted to assigned

comment:15 Changed 8 years ago by shane

  • Defect Severity set to N/A
  • Owner changed from UnAssigned to jelte
  • Sub-Project set to DNS

I think that this ticket has been Overtaken By Events, which is a shame but c'est la vie.

Jelte, what do you think? Shall I resolve this as "historical"?

comment:16 Changed 8 years ago by jelte

  • Milestone set to New Tasks
  • Resolution set to invalid
  • Status changed from assigned to closed

Ok, i think 'invalid' comes closest.

comment:17 Changed 8 years ago by shane

  • Milestone New Tasks deleted
Note: See TracTickets for help on using tickets.