Opened 9 years ago

Closed 9 years ago

#289 closed task (fixed)

Notify-out implementation

Reported by: zhanglikun Owned by: zhanglikun
Priority: medium Milestone: 06. 4th Incremental Release
Component: xfrin Version:
Keywords: Cc:
CVSS Scoring: Parent Tickets:
Sensitive: no Defect Severity:
Sub-Project: Feature Depending on Ticket:
Estimated Difficulty: 0.0 Add Hours to Ticket: 0
Total Hours: 0 Internal?: no

Description

When zone transfer finishing, it should notify its slaves. This feature belongs to xfrin process.

The problem is: xfrin module will support one command "notify", does it make user feel tricky?

Subtickets

Change History (10)

comment:1 follow-up: Changed 9 years ago by zhanglikun

  • Owner changed from zhanglikun to stephen
  • Status changed from new to reviewing

I plan to start our review process for zone manager and notify-out based on this ticket. code branch is trac289.

The design document for zone manager is:https://bind10.isc.org/wiki/SecondaryManager (After the jabber talk with Jelte and Shane, we agreed to change the name from 'Secondary manager' to 'Zone manager')

code for zonemgr:/src/bin/zonemgr
code for notify-out:/src/lib/python/isc/notify
Also there are some other code change for module xfrin/xfrout and auth.

I have added two simple config items for xfrin: master_addr, master_port, (The default value for them is '127.0.0.1' and 53) which will lead the xfrin to do zone transfer from.


comment:2 in reply to: ↑ 1 Changed 9 years ago by zhanglikun

Link of design document for zone manager is:https://bind10.isc.org/wiki/SecondaryManager

comment:3 follow-ups: Changed 9 years ago by stephen

  • Owner changed from stephen to zhanglikun

This is the combined review for #215 and this ticket.

Branch taken at revision 2531
File list obtained via svn diff -r 2531:HEAD --summarize. Files not mentioned here have been checked and have raised no comment.

src/lib/xfr/xfrout_client.cc
The comment at line 83 should perhaps be a "TODO" rather than "XXX".

Remark: Not part of this change, but line 75:

const uint8_t lenbuf[2] = { msg_len >> 8, msg_len & 0xff };

is a bit obscure without a comment to indicate that it is converting a 16-bit word to network byte order.

src/lib/python/isc/datasrc/sqlite3_ds.py
OK, although the header comment for "get_zone_datas" is misleading - it doesn't return all records, it is a generator and successive calls are required to get all the information.

src/lib/python/isc/notify/tests/notify_out_test.py
Standard ISC file header comments are missing.

test_handle_notify_reply
It would be useful to comment what flags are set in the binary data (and so what is being tested).

test_get_notify_slaves_from_ns
The IPV6 addresses in the test zone data are in the wrong format. (It should be something along the lines of "4:4:4:4::".) Although this makes no difference in this test, it will partially isolate the test from any changes in _get_notify_slaves_from_ns() should that function ever be changed to check the format of the address data.

test_init_notify_out
Same comment as above concerning IPV6 addresses.

src/lib/python/isc/notify/notify_out.py
Standard ISC file header comments are missing.

dispatcher
Would benefit from a documentation string describing the purpose of the function.

The function sleeps for 0.5 section with a comment: "A better time?". The interval should be a symbolic constant, and this question should be flagged (with a "TODO") for the future.

The check in the loop for not_replied_zones should be "<=" instead of "<" so that a timeout happening at the current time is processed. (Although this is just being pedantic as it is likely to make minimal difference to the operation of the function.)

Class ZoneNotifyInfo
Comment is wrong - there is no variable called "timeout_".

The data hiding policy is not clear: for example, the get_socket() method is supplied which returns the socket associated with the zone. But the _init_notify_out method of the NotifyOut class accesses the _notify_slaves member of this class directly.

ZoneNotifyInfo.__init__
Python coding guidelines suggest it is generally better to append an underscore to an argument where its name clashes with a reserved keyword instead of using a spelling corruption (i.e. prefer "class_" to "klass").

Inline comments in the __init__ section would serve to document all the member variables. (e.g. notify_timeout appears to be the time at which the notification times out, yet in prepare_notify_out() it is set to the current time).

Class NotifyOut
There are no comments describing the purpose of this class.

NotifyOut.get_notify_slaves_from_ns
Comments should be provided to state what this method is doing.

Remark: the list of slaves/masters for each zone seems to be something that could be in the database along with the zone information.

NotifyOut.send_notify
Comments are needed. The function name suggests that it sends a notify message whereas it just adds a zone onto the appropriate queue.

_wait_for_notify_reply
Additional comments describing the operation of the function would be appreciated.

I'm not clear about the logic concerning the determination of block_timeout. min_timeout is initialized to the current time, then in the subsequent "for" loop is either left alone or made smaller. block_timeout is then set equal to "min_timeout - <new current time>". This will always be negative (or zero depending on clock resolution), and so the subsequent "if" check will always set block_timeout to 0.

_zone_notify_handler
Would benefit from some comments describing the processing.

Is it possible for this function to be invoked with event_type = EVENT_READ, but for the call to _get_notify_reply to return null. If so, what should happen?

If the function successfully received a notify reply, the next target is set in the zone_notify_info block and send_message_udp called. However if the maximum number of tries is exceeded, the next target is set in the zone_notify_info block but send_message_udp is not called for it. Is this correct?

_create_rrset_from_db_record
Array indexes should be symbolic.

The class is explicitly set to "IN"; should the class be taken from the class of the containing zone instead?

_handle_notify_reply
I would suggest more stringent checks on the returned message; at least that the qname matches and that the qid is the same as the one sent.

src/bin/auth/auth_srv.cc
With very similar names (AuthSrv and AuthSrvImpl), it might be clearer to put the declaration and implementation of AuthSrvImpl into separate files.

The AuthSrv setXxxx and getXxxx functions are sufficiently trivial that consideration could be given to defining them inline in the header file.

Check with Jerry Chen about recent (9 August 2010) changes to this file (in processAxfrQuery) concerning problems with AXFRs failing because of connection problems.

src/bin/auth/auth_srv.h
A header describing the purpose of the class would be useful.

Remark: all methods (not just ones changed by this update) should have a doxygen header.

setXfrinSession: looks as if the last line of the header comment is missing.

src/bin/auth/asio_link.cc
The asio_link.h file contains the class definitions for the wrapper around ASIO. However, the file asio_link.cc contains additional classes (such as TCPClient) that know about the application. I suggest that asio_link.cc contain only those classes and functions relating to the wrapper interface and that the other classes be moved to a separate file. This will make it easier for asio_link to be used elsewhere.

Header documentation for the helper classes (e.g. what is their purpose and the purpose of their methods) should be provided.

src/bin/auth/asio_link.h
OK. Documentation is good.

src/bin/zonemgr/tests/zonemgr_test.py
Should include test zones for classes other than IN.

TestZoneMgrRefreshInfo.test_random_jitter
Should put the second part of the test in a loop and run it a number of times. Assuming that the _random_jitter() method is a black box (and may be altered at some time in the future), as the jitter is supposed to be random there is always the possibility that the number returned just happens to lie in the specified limits, but that another call would lie outside them. Putting the test in a loop (say 100 iterations) would not guarantee the code is OK, but would increase confidence that it is.

TestZoneMgrRefreshInfo.test_set_zone_timer
TestZoneMgrRefreshInfo.test_set_zone_refresh_timer
TestZoneMgrRefreshInfo.test_set_zone_retry_timer
The value zone_timeout is explicitly cast to "float" in the first of these methods, but not in the other two.

TestZoneMgrRefreshInfo.test_zone_not_exist
TestZoneMgrRefreshInfo.test_get_zone_soa_rdata
These are cases where a check should be made for a zone with the same name as ones that exist but a different class

TestZoneMgrRefreshInfo.test_get_zone_state
Contains two identical assertions.

TestZoneMgrRefreshInfo.test_do_refresh
(4 lines from the end of the test) - variable "refrsh_timeout" is set, but the following assertions check the value of the (ealier-defined) variable "refresh _timeout".

TestZoneMgrRefreshInfo.test_run_timer
Some comments would be useful about what is expected; it took a little time to work out the sequence of events.

src/bin/zonemgr/zonemgr.py.in

ZoneMgrRefreshInfo._random_jitter()
Needs more documentation. The internal comment says "imposes some random jitters", but the intention seems to be to come up with a value of no more than "max"; an equally valid alternative would be to come up with a value in the region of max +/- jitter.

The expression:

max - random.normalvariate(max, jitter) % jitter

... seems overly complex. (Also I'm not sure what sort of distribution this results in.) Why not the simpler

random.uniform(max - jitter, max)

ZoneMgrRefreshInfo._set_zone_refresh_timer
ZoneMgrRefreshInfo._set_zone_retry_timer
Suggest that the offsets of fields in the SOA RDATA are specified symbolically rather than by integers.

ZoneMgrRefreshInfo._find_need_do_refresh_zone
The comment "# If hasn't received refresh response within refresh timeout, skip the zone" appear to be incorrect. Something like "# If hasn't received refresh response but are within refresh timeout, skip the zone".

ZoneMgrRefreshInfo._do_refresh
The class is named ZoneMgrRefreshInfo, which suggests that it is only concerned with the management of the refresh information. This method actually performs the refresh; perhaps the class should be renamed ZoneMgrRefresh?

Class Zonemgr
Could do with some more comments, e.g. just putting in some context of what the Zonemgr class is and how it is called, what the various commands are supposed to do etc. (e.g. shutdown is a command issued by a user, but Xfrin fail is issued by another component).

Inconsistency in capitalization of the word "Zonemgr" - the three classes in the file are ZonemgrException, ZoneMgrRefreshInfo and Zonemgr. It looks as if ZoneMgrRefreshInfo should be called ZonemgrRefreshInfo.

The list of zones to monitor come from the database file and is constructed when ZoneMgrRefreshInfo (via a call to _build_zonemgr_refresh_info). ZoneMgrRefreshInfo is created when Zonemgr is created. How does the list of zones get updated (e.g. if zones are added to or removed from the database)?

Although the --verbose flag is supported (and tested for), there does not seem to be any use of it.

src/bin/bind10/bind10.py.in
Remark: the section around this change (lines 350 - 430) start the processes in sequence. If there is a problem, all previous processes are killed via an explicit set of xxx.process.kill() statements. So adding a new process into the list means modifying code further down the module to make sure it is killed if a subsequent process fails to start. As each process starts, can't it be added to a list and a single function defined to kill all processes in that list?

comment:4 in reply to: ↑ 3 Changed 9 years ago by zhanglikun

Hi Stephen, this is the reply for your review on the following files:

src/lib/xfr/xfrout_client.cc
src/lib/python/isc/datasrc/sqlite3_ds.py
src/lib/python/isc/notify/tests/notify_out_test.py
src/lib/python/isc/notify/notify_out.py

What I didn't mentioned here has been changed according your suggestion.

NotifyOut?.get_notify_slaves_from_ns
Remark: the list of slaves/masters for each zone seems to be something that could >be in the database along with the zone information.

If the datasource API will provide such feature, that's great.

_wait_for_notify_reply
I'm not clear about the logic concerning the determination of block_timeout. >min_timeout is initialized to the current time, then in the subsequent "for" loop >is either left alone or made smaller. block_timeout is then set equal >to "min_timeout - <new current time>". This will always be negative (or zero >depending on clock resolution), and so the subsequent "if" check will always set >block_timeout to 0.

Woo, you have found one bug, sorry for my test case not cover it, I have moved the code for get the min_timeout to one single function, and add one test case to cover it, thanks for your finding.

_zone_notify_handler
Is it possible for this function to be invoked with event_type = EVENT_READ, but >for the call to _get_notify_reply to return null. If so, what should happen?

yes, it's possible when the socket read reply packet with error happened. If so, the notify message will resend again.

If the function successfully received a notify reply, the next target is set in the >zone_notify_info block and send_message_udp called. However if the maximum number >of tries is exceeded, the next target is set in the zone_notify_info block but >send_message_udp is not called for it. Is this correct?

yes, send_message_udp() is not called this time, it will be called in the next loop of dispatcher().

comment:5 in reply to: ↑ 3 Changed 9 years ago by zzchen_pku

And this is the reply for your review on the zonemgr module:
src/bin/zonemgr/tests/zonemgr_test.py
src/bin/zonemgr/zonemgr.py.in

The list of zones to monitor come from the database file and is constructed when ZoneMgrRefreshInfo (via a call to _build_zonemgr_refresh_info). ZoneMgrRefreshInfo is created when Zonemgr is created. How does the list of zones get updated (e.g. if zones are added to or removed from the database)?

Because the datasource API doesn't provide such feature now, I'll add it into TODO list.

For other opinions,I have modified all of them according to your suggestions.
Thank you for your carefully review.

comment:6 Changed 9 years ago by zhanglikun

A new ticket #301 is created for the comments to the following files:
src/bin/auth/auth_srv.cc
src/bin/auth/auth_srv.h
src/bin/auth/asio_link.cc
src/bin/auth/asio_link.h

A new ticket #300 is created for the comments to the file:
src/bin/bind10/bind10.py.in

comment:7 Changed 9 years ago by zhanglikun

Hi Stephen, do you plan to do some review on the latest changement, or else, the code will be merged to trunk.

comment:8 Changed 9 years ago by zhanglikun

  • Owner changed from zhanglikun to stephen

comment:9 Changed 9 years ago by stephen

  • Owner changed from stephen to zhanglikun

All OK, please go ahead and merge.

comment:10 Changed 9 years ago by zhanglikun

  • Resolution set to fixed
  • Status changed from reviewing to closed

The code has been merged to trunk, so close this one.

Note: See TracTickets for help on using tickets.