Opened 7 years ago

Closed 7 years ago

#2905 closed defect (fixed)

buggy zone should result in SERVFAIL, not REFUSED

Reported by: jinmei Owned by: jinmei
Priority: medium Milestone: Sprint-20130528
Component: data source Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket:
Estimated Difficulty: 5 Add Hours to Ticket: 0
Total Hours: 17.32 Internal?: no

Description

With the current implementation of in-memory data source, if a zone
is configured to be loaded to memory but the load of the zone failed
due to a non fatal error (an error in the zone file or post load
validation failure), the result will be the same as if the zone isn't
configured to be loaded at all.

As a result, b10-auth would return REFUSED for queries in the zone
(if there is no other matching zone in the memory or in other data
sources), or if a shorter matching zone is loaded, it would search
that loaded zone. This is at least different from BIND 9, and I
believe the BIND 9 behavior is correct: the data source should still
recognize the zone and return SERVFAIL to queries for that zone.

We should be able to do it as follows:

  • wait until #2834 is merged
  • allow ZoneTable::addZone to add a NULL ZoneData.
  • In ConfigurableClientList::configure, if load() fails due to ZoneLoaderException, add a NULL ZoneData to the table.
  • In InMemoryClient::findZone, if findZone() returns a NULL zone data, return a special ZoneFinder that would throw DataSourceError (or something) on any method that would require ZoneData.

This way, I believe we don't have to update b10-auth itself.

Subtickets

Change History (12)

comment:1 Changed 7 years ago by muks

  • Estimated Difficulty changed from 0 to 5

comment:2 Changed 7 years ago by jinmei

  • Milestone changed from Previous-Sprint-Proposed to Next-Sprint-Proposed

comment:3 Changed 7 years ago by muks

  • Milestone changed from Next-Sprint-Proposed to Sprint-20130528

comment:4 Changed 7 years ago by jinmei

  • Owner set to jinmei
  • Status changed from new to accepted

comment:5 Changed 7 years ago by jinmei

trac2905 is ready for review.

In the end it became a bit more complicated than I originally
expected, but hopefully it's still manageable:

  • I realized we cannot simply pass NULL to addZone, because it would create an empty node in the underlying domain tree, and for the zone table empty nodes cannot be found by the find() method. So I introduced the concept of "empty zone", which is created a separate special factory method as a kind of null object.
  • At the higher level, I chose to signal the condition of "a matching zone is found but it's empty (broken)" through the return value of ClientList::find(), instead of throwing an exception. This is because it wouldn't be so uncommon that we have this type of broken zones, and handling such a case via exception can be too expensive. As a result, I also ended up modifying the auth server code a bit.

The branch consists of a set of relatively small changes so it covers
this scenario as a whole. I believe each chunk of set should be
pretty straightforward, but there are several of them, so the branch
may look a bit big.

This is an overview of the changes:

  • update the ZoneData class to support the concept of empty zone
  • update the ZoneTable class to allow adding empty zone data
  • update the ZoneWriter class so it can catch zone loading exceptions internally and install empty zone data in that case. ZoneWriter::load() is also updated so the caller can get the loading error as a string to use for logging.
  • update InMemoryClient with the consideration of empty zones.
  • update ClientList so it will create ZoneWriter that would catch load errors internally
  • update the auth Query class to return SERVFAIL when it encounters an empty zone.

There are some other small cleanups and updates, which should be
straightforward.

Proposed changelog entry:

617.?	[bug]		jinmei
	b10-auth now returns SERVFAIL to queries for a zone that is
	configured to be loaded in-memory but isn't due to load time
	errors (missing zone file or errors in the zone file, etc).
	Such zones were previously treated as non existent and would
	result in REFUSED or unintentional match against less specific
	zones.  The revised behavior is also compatible with BIND 9.
	(Trac #2905, git TBD)

Finally, while not related to this task, I'd note that the concept of
"empty zone" would be useful when we want to support caching zone data
for a subset of zones stored in the underlying data source while still
recognizing all existent zones (so queries will be answered from
either cache or the underlying data source, wherever available).
That is, we'll be able to know whether zone exists only by accessing
the in-memory cache (thus much faster), and do lookup in the
underlying data source only when we know it exists but is not just
cached.

comment:6 Changed 7 years ago by jinmei

  • Owner changed from jinmei to UnAssigned
  • Status changed from accepted to reviewing

comment:7 Changed 7 years ago by vorner

  • Owner changed from UnAssigned to vorner

comment:8 follow-up: Changed 7 years ago by vorner

  • Owner changed from vorner to jinmei

Hello

I know this should not cause any disruption, but isn't it better to keep the values of constants and allocate new numbers for the new ones?

+    static const ZoneNode::Flags EMPTY_ZONE = ZoneNode::FLAG_USER2;
 public:
     /// \brief Node flag indicating it is at a "wildcard level"
     ///
     /// This means one of the node's immediate children is a wildcard.
-    static const ZoneNode::Flags WILDCARD_NODE = ZoneNode::FLAG_USER2;
+    static const ZoneNode::Flags WILDCARD_NODE = ZoneNode::FLAG_USER3;

I think the FindeResult::flags in the zone table might benefit from little documentation.

The exception text „Zone content must not be NULL or empty“ is understandable, but only with some thinking about what it means. Should it better be „nor“?

The comment about empty at the first line here doesn't hold any more (or, is confusing at least, as empty now means something else than NULL).

The empty here no longer applies
     // It doesn't accept empty (NULL) zones
     EXPECT_THROW(zone_table->addZone(mem_sgmt_, zclass_, zname1, NULL),
-                 isc::BadValue);
+                 isc::InvalidParameter);
     EXPECT_EQ(0, zone_table->getZoneCount()); // count is still 0
 
+    // or an empty zone
+    SegmentObjectHolder<ZoneData, RRClass> holder_empty(
+        mem_sgmt_, zclass_);

Would it be better to reset the header after the whole try-catch block?

void
ZoneWriter::load(std::string* error_msg) {
    if (state_ != ZW_UNUSED) {
        isc_throw(isc::InvalidOperation, "Trying to load twice");
    }

    try {
        zone_data_ = load_action_(segment_.getMemorySegment());
        segment_.resetHeader();

        if (!zone_data_) {
            // Bug inside load_action_.
            isc_throw(isc::InvalidOperation,
                      "No data returned from load action");
        }
    } catch (const ZoneLoaderException& ex) {
        if (!catch_load_error_) {
            throw;
        }
        if (error_msg) {
            *error_msg = ex.what();
        }
        segment_.resetHeader();
    }

    state_ = ZW_LOADED;
}

This doxygen example could be more consistent in calling methods instead of having comments to do something. So, instead of „broken zone, handle it accordingly“, should it be something like createServFail();?

    ///   if (result.datasrc_) {
    ///       if (result.finder_) {
    ///           createTheAnswer(result.finder_);
    ///       } else {
    ///           // broken zone, handle it accordingly.
    ///       }
    ///   } else {
    ///       createNotAuthAnswer();
    /// } \endcode

Should this check the result is actually empty (eg. has the flag somewhere)?

    // check the result with empty (broken) zones.  Right now this can only
    // happen for in-memory caches.
    void emptyResult(const ClientList::FindResult& result, bool exact,
                     const char* trace_txt)
    {
        SCOPED_TRACE(trace_txt);
        ASSERT_FALSE(result.finder_);
        EXPECT_EQ(exact, result.exact_match_);
        EXPECT_TRUE(dynamic_cast<InMemoryClient*>(result.dsrc_client_));
    }

The loadLoaderException test should probably test passing NULL error parameter, to check it does not crash by assigning to the pointer unconditionally.

Just to make sure, the changes in 2f0e203d44adda18249239ff33a75102a8477e82 are not yet used at all in that commit and are preparation for the later commits, are they?

comment:9 in reply to: ↑ 8 ; follow-up: Changed 7 years ago by jinmei

Thanks for the review.

Replying to vorner:

I know this should not cause any disruption, but isn't it better to keep the values of constants and allocate new numbers for the new ones?

+    static const ZoneNode::Flags EMPTY_ZONE = ZoneNode::FLAG_USER2;
 public:
     /// \brief Node flag indicating it is at a "wildcard level"
     ///
     /// This means one of the node's immediate children is a wildcard.
-    static const ZoneNode::Flags WILDCARD_NODE = ZoneNode::FLAG_USER2;
+    static const ZoneNode::Flags WILDCARD_NODE = ZoneNode::FLAG_USER3;

I actually wondered the same thing. I chose changing the value so we
can consolidate the private variables (and yet avoid listing them in a
counter intuitive order, like USER1, USER3, then USER3). But,
especially because we are going to write the image in a mapped file,
it's probably better to preserve binary compatibility as much as
possible. So I revised the code that way (with one related cleanup).

I think the FindeResult::flags in the zone table might benefit from little documentation.

Hmm, not sure exactly which part of th code you meant, but I've tried
to make some clarification in zone_table.h.

The exception text „Zone content must not be NULL or empty“ is understandable, but only with some thinking about what it means. Should it better be „nor“?

(Readability aside) in my understanding "not A or B" is the
grammatically correct form while "not A nor B" isn't. But in any case
I've revised it more fundamentally. Hopefully that one is clearer.

The comment about empty at the first line here doesn't hold any more (or, is confusing at least, as empty now means something else than NULL).

The empty here no longer applies
     // It doesn't accept empty (NULL) zones
     EXPECT_THROW(zone_table->addZone(mem_sgmt_, zclass_, zname1, NULL),
-                 isc::BadValue);
+                 isc::InvalidParameter);
     EXPECT_EQ(0, zone_table->getZoneCount()); // count is still 0
 
+    // or an empty zone
+    SegmentObjectHolder<ZoneData, RRClass> holder_empty(
+        mem_sgmt_, zclass_);

Okay, revised.

Would it be better to reset the header after the whole try-catch block?

I expect in #2850 we'll eliminate the need for resetHeader()
(actually this use of resetHeader() was not completely exception
safe regardless where we call it). So I've kept this part of the code
at the moment. I'll check that again once #2850 is merged (or if this
branch is merged first, at the time of merging #2850).

This doxygen example could be more consistent in calling methods instead of having comments to do something. So, instead of „broken zone, handle it accordingly“, should it be something like createServFail();?

    ///   if (result.datasrc_) {
    ///       if (result.finder_) {
    ///           createTheAnswer(result.finder_);
    ///       } else {
    ///           // broken zone, handle it accordingly.
    ///       }
    ///   } else {
    ///       createNotAuthAnswer();
    /// } \endcode

Okay, updated.

Should this check the result is actually empty (eg. has the flag somewhere)?

    // check the result with empty (broken) zones.  Right now this can only
    // happen for in-memory caches.
    void emptyResult(const ClientList::FindResult& result, bool exact,
                     const char* trace_txt)
    {
        SCOPED_TRACE(trace_txt);
        ASSERT_FALSE(result.finder_);
        EXPECT_EQ(exact, result.exact_match_);
        EXPECT_TRUE(dynamic_cast<InMemoryClient*>(result.dsrc_client_));
    }

It checks that by confirming finder_ is NULL and dsrc_client_ isn't.
I considered introducing some more explicitly interface that indicates
the zone is empty/broken, but adding another boolean didn't seem to be
a good idea because that one and finder_ and dsrc_client_ are related
and we'll need to be careful not to break the integrity. Adding a
method like isZoneEmpty() (return true iff both finder_ and
dsrc_client_ are non NULL) would be a possible idea, but for the user
like auth query.cc it didn't seem to be very useful anyway.

At the moment I didn't touch the code regarding this.

The loadLoaderException test should probably test passing NULL error parameter, to check it does not crash by assigning to the pointer unconditionally.

It's at least implicitly tested as NULL is the default parameter
value. I clarified that point in a comment. We might test the case
of passing NULL explicitly too, but for now I've only made the comment
clarification, preferring conciseness.

Just to make sure, the changes in 2f0e203d44adda18249239ff33a75102a8477e82 are not yet used at all in that commit and are preparation for the later commits, are they?

Correct. The QueryTest.emptyZone test in the next commit relies on
2f0e20.

comment:10 Changed 7 years ago by jinmei

  • Owner changed from jinmei to vorner

comment:11 in reply to: ↑ 9 Changed 7 years ago by vorner

  • Owner changed from vorner to jinmei
  • Total Hours changed from 0 to 1.32

Hello

Replying to jinmei:

Hmm, not sure exactly which part of th code you meant, but I've tried
to make some clarification in zone_table.h.

Yes, I think it was this one.

The exception text „Zone content must not be NULL or empty“ is understandable, but only with some thinking about what it means. Should it better be „nor“?

(Readability aside) in my understanding "not A or B" is the
grammatically correct form while "not A nor B" isn't. But in any case
I've revised it more fundamentally. Hopefully that one is clearer.

But I believe such would translate to (!A) || B, which is wrong. It should AFAIK be „Neither A nor B“.

Anyway, the revised version looks good.

It checks that by confirming finder_ is NULL and dsrc_client_ isn't.
I considered introducing some more explicitly interface that indicates
the zone is empty/broken, but adding another boolean didn't seem to be
a good idea because that one and finder_ and dsrc_client_ are related
and we'll need to be careful not to break the integrity. Adding a
method like isZoneEmpty() (return true iff both finder_ and
dsrc_client_ are non NULL) would be a possible idea, but for the user
like auth query.cc it didn't seem to be very useful anyway.

Hmm. I just got the impression that something like 5 different places got the flags, so this variable must have gotten them too. But I was wrong, obviously. In such case, this is OK.

It's at least implicitly tested as NULL is the default parameter
value. I clarified that point in a comment. We might test the case
of passing NULL explicitly too, but for now I've only made the comment
clarification, preferring conciseness.

Ah, OK then.

It looks good to be merged.

comment:12 Changed 7 years ago by jinmei

  • Resolution set to fixed
  • Status changed from reviewing to closed
  • Total Hours changed from 1.32 to 17.32

merged. closing.

Note: See TracTickets for help on using tickets.