Opened 8 years ago

Closed 8 years ago

#1314 closed task (complete)

IXFR-out system tests

Reported by: stephen Owned by: jelte
Priority: medium Milestone: Sprint-20111220
Component: Unclassified Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity: N/A
Sub-Project: DNS Feature Depending on Ticket: IXFR-out
Estimated Difficulty: 7 Add Hours to Ticket:
Total Hours: 9.63 Internal?: no

Description

#1210 produced a specification for IXFR system tests. This ticket is for the implementation of the part of the tests relating to IXFR-out functionality.

Subtickets

Change History (20)

comment:1 Changed 8 years ago by shane

  • Milestone changed from New Tasks to Next-Sprint-Proposed

comment:2 Changed 8 years ago by jelte

  • Add Hours to Ticket changed from 0 to 7

comment:3 Changed 8 years ago by jelte

  • Add Hours to Ticket 7 deleted
  • Estimated Difficulty changed from 0 to 7

comment:4 Changed 8 years ago by jelte

  • Milestone changed from Next-Sprint-Proposed to New Tasks

comment:5 Changed 8 years ago by shane

  • Milestone changed from New Tasks to Next-Sprint-Proposed

comment:6 Changed 8 years ago by jelte

  • Milestone changed from Next-Sprint-Proposed to Sprint-20111206

comment:7 Changed 8 years ago by jelte

  • Owner set to jelte
  • Status changed from new to assigned

comment:8 Changed 8 years ago by shane

  • Feature Depending on Ticket set to IXFR-out

comment:9 Changed 8 years ago by jelte

  • Owner changed from jelte to UnAssigned
  • Status changed from assigned to reviewing
  • Total Hours changed from 0 to 5

I've created a new terrain piece (for throwing transfer requests at a server), and implemented the tests.

They *should* all work, however, since we don't support UDP IXFR; most of them (all but one, in fact), are commented out for now.

I have tested these tests against a BIND9 with the same data and history, and most of the commented-out tests work, except for the ones where bind 9's behaviour differs from the ones in the test (tests 3 and 4 of test set 1, and tests 1 and 2 of testset 2).

The transfer.py code should also be usable should we want to do AXFR tests, but we might want to extend its functionality a bit then (for now I only needed the answer size in number of records and the full result).

comment:10 Changed 8 years ago by jinmei

  • Owner changed from UnAssigned to jinmei

comment:11 follow-up: Changed 8 years ago by jinmei

First off, the test did not always succeed (but sometimes it
succeeded). I often saw this:

    When I do an IXFR transfer of example.com 14 over tcp                                                               # features/terrain/transfer.py:81
    The transfer result should have 14 RRs                                                                              # features/terrain/transfer.py:107
    Traceback (most recent call last):
      File "/Library/Python/2.6/site-packages/lettuce-0.1.33-py2.6.egg/lettuce/core.py", line 117, in __call__
        ret = self.function(self.step, *args, **kw)
      File "/Users/jinmei/src/isc/git/bind10-1314/tests/lettuce/features/terrain/transfer.py", line 119, in check_transfer_result_count
        " records, expected " + str(number_of_rrs)
    AssertionError: Got 0 records, expected 14

And, when it failed, I found this in lettuce/output/IXFR...stderr:

Traceback (most recent call last):
  File "/Users/jinmei/src/isc/git/bind10-1314/src/bin/xfrout/b10-xfrout", line 1008, in <module>
    xfrout_server = XfroutServer()
  File "/Users/jinmei/src/isc/git/bind10-1314/src/bin/xfrout/b10-xfrout", line 896, in __init__
    self._cc.start()
  File "/Users/jinmei/src/isc/git/bind10-1314/src/lib/python/isc/config/ccsession.py", line 210, in start
    self.__send_spec()
  File "/Users/jinmei/src/isc/git/bind10-1314/src/lib/python/isc/config/ccsession.py", line 370, in __send_spec
    answer, env = self._session.group_recvmsg(False, seq)
  File "/Users/jinmei/src/isc/git/bind10-1314/src/lib/python/isc/cc/session.py", line 266, in group_recvmsg
    env, msg  = self.recvmsg(nonblock, seq)
  File "/Users/jinmei/src/isc/git/bind10-1314/src/lib/python/isc/cc/session.py", line 121, in recvmsg
    data = self._receive_full_buffer(nonblock)
  File "/Users/jinmei/src/isc/git/bind10-1314/src/lib/python/isc/cc/session.py", line 203, in _receive_full_buffer
    self._receive_len_data()
  File "/Users/jinmei/src/isc/git/bind10-1314/src/lib/python/isc/cc/session.py", line 163, in _receive_len_data
    new_data = self._receive_bytes(self._recv_len_size)
  File "/Users/jinmei/src/isc/git/bind10-1314/src/lib/python/isc/cc/session.py", line 151, in _receive_bytes
    raise ProtocolError("Read of 0 bytes: connection closed")
isc.cc.session.ProtocolError: Read of 0 bytes: connection closed

(This exception itself seems to be a kind of bug but that aside) This
seems to suggest that b10-xfrout did not fully start up when the IXFR
query was made. If that always succeeds in your environment, maybe
you can try it with adding some intentional delay (e.g.) between
these:

        set_signal_handler()
        xfrout_server = XfroutServer()

If my guess is correct, maybe we'll need something similar to
wait_for_auth.

Second, I made a couple of editorial fixes and pushed them.

lettuce/README.tutorial

(not a subject of this ticket but I first read the READMEs to
understand the test and noticed these)

  • is this a typo? The first 'no' should be removed perhaps?
    The one scenario we have no has no steps,
    
  • s/fire of/fire off/?
    This is not good enough; it will fire of the process, but setting up
    

transfer.py

I didn't understand what kind of magic was used for 'step.multiline'

    expect = re.sub("[ \t]+", " ", step.multiline)

but with common sense it looks okay.

ixfr_out_bind10.feature

I have to admit I've not read the commented out tests. The enabled
tests seem to be okay (except that they didn't always succeed as
discussed above).

comment:12 Changed 8 years ago by jinmei

  • Owner changed from jinmei to jelte

comment:13 in reply to: ↑ 11 ; follow-up: Changed 8 years ago by jelte

  • Owner changed from jelte to jinmei

Replying to jinmei:

First off, the test did not always succeed (but sometimes it
succeeded). I often saw this:

<snip>

(This exception itself seems to be a kind of bug but that aside) This
seems to suggest that b10-xfrout did not fully start up when the IXFR
query was made. If that always succeeds in your environment, maybe
you can try it with adding some intentional delay (e.g.) between
these:

If my guess is correct, maybe we'll need something similar to
wait_for_auth.

Ack, added a similar 'wait for xfrout' step. We could already do this through a 'normal' wait for stderr output, but it seems cleaner to have a separate step. And I'm not entirely happy with the line it now waits for (XFROUT_NEW_CONFIG_DONE), but xfrout doesn't log something like 'I am now done, and running'. Perhaps we should have those in all modules as some info or debug statement.

Can you see if it now works consistently? I expect the error is caused by killing the rest of the system while a module is starting up. But yeah, if so, we should probably present some nicer output there.

Second, I made a couple of editorial fixes and pushed them.

Thanks

lettuce/README.tutorial

(not a subject of this ticket but I first read the READMEs to
understand the test and noticed these)

  • is this a typo? The first 'no' should be removed perhaps?
    The one scenario we have no has no steps,
    

yes, fixed.

  • s/fire of/fire off/?
    This is not good enough; it will fire of the process, but setting up
    

ack. Made it 'it will start the process', btw.

transfer.py

I didn't understand what kind of magic was used for 'step.multiline'

    expect = re.sub("[ \t]+", " ", step.multiline)

but with common sense it looks okay.

It's a bit of a weird construct to pass longer sets of data to the function; if it reads a three quotes, the value is passed into the step's multiline member (also see http://lettuce.it/tutorial/multiline.html but that mostly talks about whitespace problems).

Typing this I realize we could perhaps also implement this using tables, but the way to get to that data is pretty similar (http://lettuce.it/tutorial/tables.html)

ixfr_out_bind10.feature

I have to admit I've not read the commented out tests. The enabled
tests seem to be okay (except that they didn't always succeed as
discussed above).

Ok. Well, they are commented out anyway (which is really not following the yagni principle, but I suspect having them ready when we do udp will be very nice, even if we don't end up using all of them, at least we'll be forced to realize that)

comment:14 in reply to: ↑ 13 ; follow-up: Changed 8 years ago by jinmei

Replying to jelte:

Replying to jinmei:

First off, the test did not always succeed (but sometimes it
succeeded). I often saw this:

Ack, added a similar 'wait for xfrout' step. We could already do this through a 'normal' wait for stderr output, but it seems cleaner to have a separate step. And I'm not entirely happy with the line it now waits for (XFROUT_NEW_CONFIG_DONE), but xfrout doesn't log something like 'I am now done, and running'. Perhaps we should have those in all modules as some info or debug statement.

Yeah, and I think we need more generic and reliable way to confirm
that a component is fully ready, not only for tests. For example, I
guess b10-auth would like to know whether xfrout/ddns is really active
rather than just naively passing requests to the corresponding UNIX
domain socket (and possibly getting an exception subsequently).

One thing we could do for tests would be to implement the 'ping'
command for all components.

Anyway, that will be beyond the scope of this ticket.

Can you see if it now works consistently? I expect the error is caused by killing the rest of the system while a module is starting up. But yeah, if so, we should probably present some nicer output there.

Yes, it now succeeds constantly.

(Some other tests still fail, btw,

    A query for www.example.org to 127.0.0.1:47807 should have rcode NOERROR                       # features/terrain/querying.py:183
...
    AssertionError: Expected: NOERROR, got NO_ANSWER

    Then wait for new bind10 stderr message XFRIN_XFR_TRANSFER_SUCCESS not XFRIN_XFR_PROCESS_FAILURE             # features/terrain/steps.py:34
...
    AssertionError: Timeout waiting for process output: [u'XFRIN_XFR_TRANSFER_SUCCESS', u'XFRIN_XFR_PROCESS_FAILURE']

"NO_ANSWER" and "timeout" seem to suggest the same thing might be
happening? (again, would be a separate issue though).

ixfr_out_bind10.feature

I have to admit I've not read the commented out tests. The enabled
tests seem to be okay (except that they didn't always succeed as
discussed above).

Ok. Well, they are commented out anyway (which is really not following the yagni principle, but I suspect having them ready when we do udp will be very nice, even if we don't end up using all of them, at least we'll be forced to realize that)

I've now quickly looked at others a bit more closely. Some comments:

  • What's the difference between Scenario 1's tests 1 and 2?
  • I guess many of the tests can be done successfully over TCP and wonder we should actually do it?

comment:15 Changed 8 years ago by jinmei

  • Owner changed from jinmei to jelte

comment:16 in reply to: ↑ 14 ; follow-up: Changed 8 years ago by jelte

  • Owner changed from jelte to jinmei

Replying to jinmei:

Yeah, and I think we need more generic and reliable way to confirm
that a component is fully ready, not only for tests. For example, I
guess b10-auth would like to know whether xfrout/ddns is really active
rather than just naively passing requests to the corresponding UNIX
domain socket (and possibly getting an exception subsequently).

One thing we could do for tests would be to implement the 'ping'
command for all components.

Yes. One problem with that (at least in this context), is that it is, how do you say it, a pull-type request (you need to ping it, if it has not started, you need to ping it again after some delay). The nice thing (again, in this context) about the current approach, if supported through log messages, it that you just wait until the message comes by. Originally I had thought of having some sort of announcement channel in the msgq where modules would shout "Hi guys, I'm up!", but such things have been disappeared in the backlog somewhere :p. Either approach would be very good to have though.

Anyway, that will be beyond the scope of this ticket.

ack.

Yes, it now succeeds constantly.

"NO_ANSWER" and "timeout" seem to suggest the same thing might be
happening? (again, would be a separate issue though).

AFAIK every test should have something similar, especially for queries, but I'd need to take a look at the specific test that fails and perhaps its output. but not for this ticket indeed :)

I've now quickly looked at others a bit more closely. Some comments:

  • What's the difference between Scenario 1's tests 1 and 2?

Oops, serial of test two should be 22. Changed.

Also just noticed I made a similar mistake in scenario 2 test 3.

  • I guess many of the tests can be done successfully over TCP and wonder we should actually do it?

Some of them can (in fact, I wrote those using tcp in the first place). I reverted them before submitting to match the list on the IxfrSystemTest? page, but yeah, we can use TCP.

For those where it is possible, I changed udp to tcp, with a comment that the original requirement is for udp. Added a note at the top that we might want to keep the tcp tests once we have udp support, and perhaps reflect that back on the systest page then too.

For all tests I either added a comment that they are changed to TCP for now or that they are commented out because they are not supported yet and have no TCP equivalent.

Note: I did not do the first one, we actually fail here (I hope because this branch does not do serial comparison yet)

comment:17 in reply to: ↑ 16 Changed 8 years ago by jinmei

Replying to jelte:

It looks okay. Please feel fee to merge...
...but just one more thing ((c)Peter Falk): maybe we need a changelog
entry for it. If you also think so, please just add it on merge.
I'll take a look and if I see the need for making a comment on it
(unlikely) I'll raise it later.

comment:18 Changed 8 years ago by jinmei

  • Owner changed from jinmei to jelte

comment:19 Changed 8 years ago by jinmei

  • Total Hours changed from 5 to 7.63

comment:20 Changed 8 years ago by jelte

  • Resolution set to complete
  • Status changed from reviewing to closed
  • Total Hours changed from 7.63 to 9.63

Thanks, merged, you can see changelog in master.

Closing ticket

Note: See TracTickets for help on using tickets.